For the next few months, I will be working on Aletheia, a comprehensive AI safety platform that automatically tests, monitors, and evaluates AI models for alignment issues across multiple dimensions: truthfulness, helpfulness, harmlessness, and value alignment. It will combine continuous safety monitoring, automated red-teaming, and interpretability insights to support responsible AI deployment at scale.
I plan to use Python with FastAPI for the backend, calling AI models through the Together AI API, and React for the frontend.
I envision users selecting an AI model of their choice, after which the platform automatically runs safety tests (jailbreak attempts, red-teaming, etc.) and produces a comprehensive report on the model's safety.
Today, I started by installing the relevant dependencies for Together AI API calls and sketching out a simple technical architecture.
pip3 install together openai requests
First, I mounted my model back onto my Colab runtime, since Colab wipes everything when a session ends.
After that, I developed a script to test the model across 10 examples. It performed horribly, with an average similarity score of 20%.
I knew the model itself wasn't the problem, thanks to earlier comprehensive debugging, so I examined my training data and found an issue.
The training dataset had chunks of meaningless text, and a few examples repeated "<|endoftext|>" more than a hundred times. Since I couldn't clean up the training data given how arbitrary the meaningless text was, I retrained the model on a smaller, better-organized dataset (python_code_instructions_18k_alpaca).
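A quick check for this kind of corruption can be sketched as follows. This is a hypothetical helper, not my exact script, and the thresholds are illustrative:

```python
import re

def looks_corrupted(text: str, max_eot: int = 5, min_alpha_ratio: float = 0.3) -> bool:
    """Flag training examples that repeat the "<|endoftext|>" marker
    or are mostly non-alphanumeric noise."""
    if text.count("<|endoftext|>") > max_eot:
        return True
    stripped = re.sub(r"\s", "", text)
    if not stripped:
        return True  # empty or whitespace-only example
    alpha_ratio = sum(c.isalnum() for c in stripped) / len(stripped)
    return alpha_ratio < min_alpha_ratio
```

An example containing "<|endoftext|>" a hundred times gets flagged immediately, while ordinary code passes.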
The training went well and only took 17 minutes, since there were only 18k examples. But this also means the model wouldn't perform as well.
After that, I tested the model on 28 comprehensive examples, and it performed significantly better, although its similarity scores were still low.
In the end, this was a training data issue. I looked for larger, better-structured datasets but couldn't find any that were open source. So this will be the end of this project, and I won't be deploying my model to Hugging Face.
Lessons learned:
Now I'll move on to my next project: building a technical application for comprehensive alignment testing.
Colab
Today, I managed to get a free Colab Pro subscription through their student verification process, which gives me access to better GPUs.
Instead of using a T4 (which I constantly had issues with), I decided to use an A100 to reduce my training time significantly.
I revised my training script to optimize it for the A100 and to use most of the 40 GB of GPU memory provided, setting the script to use 95% of it.
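In PyTorch, capping a process at a fraction of GPU memory is a one-liner; a sketch, assuming training runs on CUDA device 0:

```python
import torch

# Cap this process at 95% of the GPU's memory on device 0, leaving a
# little headroom for CUDA context overhead. Only meaningful on a GPU.
if torch.cuda.is_available():
    torch.cuda.set_per_process_memory_fraction(0.95, device=0)
```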
Setting Up for Training
Then, I set up the GPT-2 124M-parameter model, its tokenizer, and my special tokens to prepare for training.
As usual, I initialized Weights & Biases to monitor the training loss, validation loss, and learning rate.
After that, I loaded my full tokenized datasets and printed out how many examples I had for tracking purposes.
With all of that set, I wrote my training arguments to fully utilize the A100 GPU’s capabilities.
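As a rough sketch, the A100-oriented settings amounted to something like the dict below. The values are illustrative, not my exact configuration:

```python
# Illustrative hyperparameters for an A100 (40 GB): the core idea is
# larger batches plus mixed precision to exploit the Tensor Cores.
a100_training_args = {
    "per_device_train_batch_size": 32,   # bigger batches fit in 40 GB
    "gradient_accumulation_steps": 1,
    "fp16": True,                        # mixed precision for Tensor Cores
    "learning_rate": 5e-5,
    "num_train_epochs": 3,
    "evaluation_strategy": "steps",
    "report_to": "wandb",                # log to Weights & Biases
}
```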
Training Results
I successfully ran the script, and training took only around 2.5 hours, a significant reduction from the ~5 hours on the T4 GPU.
Looking at my wandb panels, the training clearly went well: both the training and validation losses decreased over the course of training, meaning the model was becoming more accurate, and the scheduled decay in the learning rate helped it converge toward an optimized solution.
Moving On
I will work on creating a simple test to evaluate the quality of the model's output, and eventually run it on all the code samples in my testing dataset.
If the model performs with at least 80% accuracy, I will deploy it to Hugging Face and build a simple web application that uses it through API calls.
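A minimal version of that evaluation using only Python's standard library (difflib) might look like this; the function names and the 80% threshold wiring are my own sketch, not a fixed standard:

```python
from difflib import SequenceMatcher

def similarity(generated: str, reference: str) -> float:
    """Ratio in [0, 1] of how closely the model output matches the reference."""
    return SequenceMatcher(None, generated, reference).ratio()

def passes_threshold(scores: list[float], threshold: float = 0.8) -> bool:
    """Deploy only if the average similarity clears the threshold."""
    return sum(scores) / len(scores) >= threshold
```

Identical strings score 1.0, completely different strings score near 0, so an average over the test set gives a rough accuracy figure.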
Colab
Today, I continued optimizing my training script to reduce the time it takes to train my GPT-2 model.
First Attempt:
I also tried enabling TensorFloat-32 (TF32), but it isn't supported on the T4 GPU.
With those changes, I ran the code, but it returned a CUDA "Out of Memory" error.
To accommodate those limits, I reduced the batch size back to 12 but left everything else constant.
In the end, I was able to bring the training time down to around 5 hours while using 90-95% of the T4 GPU.
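For reference, TF32 needs an Ampere-class GPU (compute capability 8.0+), while the T4 is Turing (7.5). A guard like this, which is my own sketch rather than code from my actual script, avoids enabling it on unsupported hardware:

```python
import torch

def maybe_enable_tf32() -> bool:
    """Enable TF32 matmuls only on GPUs that support it (compute capability >= 8.0)."""
    if not torch.cuda.is_available():
        return False
    major, _minor = torch.cuda.get_device_capability()
    if major < 8:  # the T4 is compute capability 7.5 (Turing), so no TF32
        return False
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
    return True
```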
Moving on, if I can't reduce the time further, I will just train my model as is.
For the past month, I worked on creating a new website for the Technology Policy Society at JHU to better capture the scope and impact of the organization's work.
Since this is purely a frontend project, I started by bootstrapping with Create React App.
npm install react-bootstrap bootstrap
Then, I created a basic structure for my multipage website that utilizes react-router-dom.
components/
├── Navigation.tsx
├── MainPage.tsx
├── ProjectPage.tsx
├── ResourcePage.tsx
├── TeamPage.tsx
├── TeamCards.tsx
├── Footer.tsx
└── and CSS files for all components
Navigation Bar
I created a sticky navigation bar with a simple menu containing "HOME," "PROJECTS," "RESOURCES," and "TEAM."
Main Page
Projects Page
Resources Page
Team Page
Footer
The website is currently deployed through Vercel.
Feel free to check it out: https://tps-jhu.vercel.app
Today, I worked on deploying my web application to Vercel. Although I finished this project around a month ago, I wanted to deploy it publicly so that it is visible to everyone.
I initially thought of deploying it on GitHub Pages, but after doing some research I realized that GitHub Pages can only host static sites. Since my web application makes API calls to a backend, that wasn't an option, so I moved on to Vercel.
Vercel Deployment
First, I removed the homepage field from my frontend's package.json and the /AI-Risk-Auditor root path in App.tsx, since I was no longer deploying to GitHub Pages.
Then, I created a Dockerfile to run my backend on a Render server, since my backend is written in Java.
After that, I deployed my backend service on Render, which gave me a primary public URL I could use with Vercel.
With that public URL, I created an environment variable so that my API calls go to Render instead of localhost:8080.
I made the corresponding changes to my api.ts file so that it uses the environment variable I had just set.
Everything was set, and I deployed my web application on Vercel. But when I tried to run the risk assessment, it returned a "Network Error." Going into Inspect -> Network showed me a CORS error. To fix it, I updated my allowed cross-origins to include the Vercel deployment and added a global CORS configuration just in case.
Thankfully, everything works now, and I'm no longer getting errors when running the risk assessment. You can check out the web application at https://ai-risk-auditor.vercel.app/home
Full Training Script Development
Today, I worked on developing a script to train GPT-2 on all of the tokenized data, and did some debugging to speed up the process.
The code itself is largely similar to test-training.py; I just tweaked some details and removed the load_small_subset function. I first set up wandb to track the model's training progress, set up the model, and loaded the full dataset.
Then, I set the training arguments with a conservative learning rate of 5e-5 and kept the other parameters low so that training wouldn't overload my local CPU.
I kept the generation temperature the same (0.7) as in the testing trial, since I'm still dealing with significantly less data than industry models.
I ran the code and everything seemed to work, but it was going to take 200+ hours. I did some testing in JupyterLab to see if I could optimize the training process, but that only brought the time down to 170 hours. Instead of using my computer's local CPU, I decided to migrate to Colab, since it provides a free T4 GPU.
Colab
I first mounted my Drive onto the Colab notebook, cloned my repository to access my scripts, and installed all the required libraries.
After that, I checked the status of the GPU to ensure it was working properly.
Then, I ran my full training code, and the estimate came down to 10 hours, a significant improvement. But 10 hours was still insanely long, so I checked the GPU usage, and it showed only 4.9/15.0 GB in use. I did some more optimization to fully harness the GPU: increasing the batch size made training more efficient and brought the time down to 7 hours.
I also tried learning rates of 1e-4 and 3e-4 for fine-tuning, but they didn't make much of a difference.
Moving on, I will work on further optimization to train my model more efficiently.
Today, I worked on setting up my GPT-2 model and test-trained it with a small sample to ensure that everything works before full training.
GPT-2 Set Up
I set up the model by loading it along with its pre-trained tokenizer. The model runs on my computer's CPU because I don't have a CUDA-capable GPU available at the moment.
Then, I created a simple input format to train the model with one example.
Thankfully, the model produced the expected outcome without any errors, giving me the green light to move on to test training.
Test Training
First, I loaded a small subset (100 examples) of data to train my GPT-2 model.
Then, I loaded the model and added two special tokens (padding and separator) to ensure uniform formatting.
For the training arguments, I set the epochs to 1 and the learning rate to a conservative 5e-5, since the model is training on my computer's CPU.
After creating the trainer, I moved on to testing output extraction. I used a low temperature for this test so that the model stays more focused while dealing with a small subset of data.
I also logged this test run to wandb to visualize how the model trains. The downward slope of both the eval and train loss curves indicates that the model fits the training data well and is generalizing to unseen data.
That’s it for today, and I will work on full training moving on.
Today, I worked on coding scripts to test the data quality, clean the data, tokenize it for GPT-2, and finally test the tokenized data.
Step 1: Testing Data Quality
I start by checking the data's basic structure in order to eliminate obviously low-quality examples (e.g., empty entries).
After that, I check the quality of the code by verifying whether it is syntactically valid, whether it has functions, classes, or imports, and whether it is too long or too short. I define too short as fewer than 2 lines of code and too long as more than 50.
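Python's built-in ast module makes the syntax and structure checks straightforward. Here is a sketch of that kind of check using the length thresholds above (the function name and report shape are my own, not from my actual script):

```python
import ast

def check_code_quality(code: str, min_lines: int = 2, max_lines: int = 50) -> dict:
    """Report syntactic validity, structural features, and length for one sample."""
    report = {"valid": False, "has_function": False, "has_class": False,
              "has_import": False, "length_ok": False}
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return report  # invalid code fails every structural check
    report["valid"] = True
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            report["has_function"] = True
        elif isinstance(node, ast.ClassDef):
            report["has_class"] = True
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            report["has_import"] = True
    n_lines = len([ln for ln in code.splitlines() if ln.strip()])
    report["length_ok"] = min_lines <= n_lines <= max_lines
    return report
```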
Then, I check the quality of each code sample's explanation by parsing for common terms.
Finally, I generate a report for each split so that it’s easy to access on my repo.
Step 2: Cleaning Data
After testing the data quality, I clean the data by first removing invalid examples and then cleaning and formatting the code and explanations.
Step 3: Tokenizing Data
With all the data clean, I prepare it for tokenization for the GPT-2 model. I tokenize every dataset split and make it compatible with Hugging Face and GPT-2.
I also statistically analyze the token lengths to understand the size of the data I'm dealing with.
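The length statistics need nothing beyond the standard library. A sketch, assuming token_lengths is the per-example token count list produced by the tokenizer:

```python
import statistics

def token_length_stats(token_lengths: list[int]) -> dict:
    """Summarize per-example token counts to gauge dataset size and spread."""
    return {
        "count": len(token_lengths),
        "min": min(token_lengths),
        "max": max(token_lengths),
        "mean": statistics.mean(token_lengths),
        "median": statistics.median(token_lengths),
    }
```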
Repeat
These three steps are repeated for each data split (train, test, validation).
Example, step 1 for the test split:
Example, step 2 for the test split:
Example, step 3 for the test split:
Testing Tokenized Data Quality
After tokenizing all the data, I created another test to verify the quality of the tokenized data.
I also check for data consistency across the different data splits.
Additionally, I created a visual representation of the data quality across all splits.
Fortunately, most of the data was valid, meaning I have a good chunk of data to train GPT-2 with.
Notes: I pulled all the raw data off my GitHub repo to keep the repo lightweight and respect data licenses. People can still access the raw data by running my load-data.py script.
To start this project, I began setting up my code environment by downloading the necessary libraries.
pip3 install transformers datasets torch accelerate wandb
pip3 install pandas numpy matplotlib seaborn
pip3 install requests
# (zipfile and gzip ship with Python's standard library, so they don't need installing)
Then, I created a project structure to organize the model, data, outputs, etc.
code-explanation-model/
├── data/
├── models/
├── scripts/
├── notebooks/
└── outputs/
Now that I've set up the basics, I wrote a script to pull Python code data from CodeXGLUE on Hugging Face. Since it is nearly impossible for me to manually create all the data needed to train and fine-tune my model, I decided to use a public dataset. I simply use the load_dataset method to pull the CodeXGLUE data from Hugging Face.
Then, I created a function to explore the dataset's format and understand the data I'm dealing with.
After that, I format the data for GPT-2 training and save it to a JSON file.
Finally, I created a three-way split of the data (train, validation, test) so that I can train the model on the majority of the data, validate it on held-out examples, and test it on unseen data for real-world performance. It's a standard method to prevent overfitting.
By running the code, I was able to format and save 251,820 examples.
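A minimal version of such a split looks like this. The 80/10/10 ratios here are a common default, not necessarily the exact ones I used:

```python
import random

def three_way_split(examples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle deterministically, then slice into train/validation/test."""
    shuffled = examples[:]  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # remainder, roughly 10%
    }
```

Fixing the seed makes the split reproducible, so the test set stays truly unseen across reruns.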
Notes: I had to install Git Large File Storage (Git LFS) since I had too much data to commit directly to Git. It's also common practice to use Git LFS for ML projects.
Now that I’m done with data processing, I will move on to working on: