Anirudh (Andy) Kamath

Data Scientist. CS/Business Student.

About Me


Education: Northeastern University (2017-2022)

Course of Study: Computer Science/Business Combined Major, Finance Concentration, Computational Data Analytics Minor


I'm from Charlotte, NC. I have a keen interest in applied data analytics in biomedicine and/or finance. I founded CounteractIO, an ML company focused on building cool stuff. Outside of school, I have an affinity for ancient languages, having picked up both Latin and Sanskrit in high school. I spent a semester abroad in Thessaloniki, Greece at the American College of Thessaloniki in the Fall of 2017. During this time, I was able to do some awesome things, most notably climbing Mount Olympus and visiting beautiful places like Athens, Berlin, Strasbourg, and Corfu.

Work

Data Intern Summer 2018

Generated optimal shipping routes from the seller, through a verification checkpoint, to the buyer for $2M worth of goods every day.

Research Assistant Spring 2018

Responsible for data pipelining, visualization, and analysis of eye gaze tracking data in the Interdisciplinary Affective Science Laboratory.

Projects

ColoBoost

Colon Cancer Screening - Faster and Better

(2017) - Earlier this year, a close relative of mine was diagnosed with late-stage colon cancer. Seeing her go through the treatment process, I witnessed firsthand the inefficiencies of the current screening process. It's the most preventable cancer, yet the second most fatal. The good screening tests are expensive and take two weeks, and the faster, cheaper screening tests are inaccurate. I wanted to do something about this, so I used machine learning and conditional probability to combine an individual's health choices with a cheap biomarker test, building a combined test that's as accurate as the expensive one with the speed and low cost of the cheaper one. My end result is 60x cheaper and 4,000x faster than current benchmarks like Cologuard. Once we finish clinical validation trials, it should offer a much more efficient way to diagnose people early, when survival rates are above 94%, rather than in stage 4, when survival rates are below 10%.
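As a rough illustration of the conditional-probability idea, the sketch below uses Bayes' rule to update a prior risk (for example, one estimated from health choices) with the result of a cheap test. The function name and every number in it are made-up placeholders, not ColoBoost's actual model or parameters.

```python
def posterior_risk(prior: float, sensitivity: float, specificity: float,
                   test_positive: bool) -> float:
    """Return P(disease | test result) using Bayes' rule."""
    if test_positive:
        p_result_given_disease = sensitivity
        p_result_given_healthy = 1.0 - specificity
    else:
        p_result_given_disease = 1.0 - sensitivity
        p_result_given_healthy = specificity
    numerator = p_result_given_disease * prior
    evidence = numerator + p_result_given_healthy * (1.0 - prior)
    return numerator / evidence


# Hypothetical inputs: a 5% prior risk and a test with
# 80% sensitivity and 90% specificity.
risk_after_positive = posterior_risk(0.05, 0.80, 0.90, test_positive=True)
risk_after_negative = posterior_risk(0.05, 0.80, 0.90, test_positive=False)
print(f"after positive: {risk_after_positive:.3f}")
print(f"after negative: {risk_after_negative:.3f}")
```

The point of the sketch is only that a stronger prior makes a cheap, imperfect test more informative; how the real system estimates that prior is not shown here.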

Read more:
Write-Up Website

Gllass

Image analytics for social media

(2016-17) - Last year, a friend and I built a platform that highlights popular and common features in a user's Instagram pictures so they can see what to post more of in order to draw more attention to their profile. On top of this, we built a "like prediction" system that predicts likes within a 10% margin of error. This year, I added to this by building a full caption generation system.


To do this, I used the VGG16 pre-trained convolutional neural network model to generate tags for each image, then checked which tags were associated with higher like counts and returned those as the most popular tags. Once that was complete, I vectorized all the tags associated with each image and used the vectors to train a random forest regression model. When a new image is uploaded to have its likes predicted and a caption generated, the image is run through the CNN again and its tags are vectorized the same way. For captions, I trained an RNN on encouraging quotes, then ran a beam search to generate a paragraph based on the tags generated by the CNN so the caption stays relevant to the post. Since the RNN's output was incoherent on its own, I combined it with a Markov chain trained on the same encouraging quotes dataset to generate a caption that has some meaning. Finally, the tag vector is put through the trained random forest regression model to generate a like prediction.
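The tag-ranking and vectorization steps can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the CNN has already produced a list of tags per image; the function names, the multi-hot encoding, and the sample posts are hypothetical and not taken from the Gllass code.

```python
from collections import defaultdict


def rank_tags_by_likes(posts):
    """posts: list of (tags, likes). Return tags sorted by mean likes, highest first."""
    totals = defaultdict(lambda: [0, 0])  # tag -> [sum of likes, number of posts]
    for tags, likes in posts:
        for tag in set(tags):
            totals[tag][0] += likes
            totals[tag][1] += 1
    means = {tag: total / count for tag, (total, count) in totals.items()}
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(means, key=lambda tag: (-means[tag], tag))


def vectorize(tags, vocabulary):
    """Multi-hot encode a post's tags against a fixed vocabulary order."""
    present = set(tags)
    return [1 if tag in present else 0 for tag in vocabulary]


# Made-up example data: (tags from the CNN, like count)
posts = [
    (["beach", "sunset"], 120),
    (["beach", "dog"], 200),
    (["coffee"], 40),
]
ranked = rank_tags_by_likes(posts)
vocabulary = sorted({tag for tags, _ in posts for tag in tags})
features = [vectorize(tags, vocabulary) for tags, _ in posts]
print(ranked)
print(features)
```

Vectors like `features`, paired with the like counts, are the kind of input a regression model such as a random forest could then be trained on.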

Read more:
Caption Generation Demo

DR Screening

ML for Retinopathy Screening

(2017) - This was the first ML project I did entirely on my own. I built on an existing project by Antal and Hajdu at the University of Debrecen to create a more sensitive model for diabetic retinopathy screening. I used their AM-FM based feature detection algorithms to extract features such as microaneurysms, lesions, and exudates from an image of the retina, then used those features to predict the retinopathy stage with 97% sensitivity.


To do this, I used an ensemble composed of an artificial neural network, a logistic regression classifier, and a linear kernel support vector machine.
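One common way to combine three classifiers is a simple majority vote, sketched below. How the three models were actually combined in this project isn't described here, so the voting rule, the function name, and the sample predictions are all assumptions for illustration.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    # zip(*...) groups the votes that all models cast for the same sample.
    for sample_votes in zip(*predictions_per_model):
        label, _ = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


# Hypothetical outputs of three already-trained models on four retina images
# (1 = signs of retinopathy, 0 = none).
ann_preds = [1, 0, 1, 1]
logreg_preds = [1, 0, 0, 1]
svm_preds = [0, 0, 1, 1]
ensemble_preds = majority_vote([ann_preds, logreg_preds, svm_preds])
print(ensemble_preds)
```

With three binary classifiers there are no ties, which is one reason an odd number of models is convenient for this kind of vote.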

Read more:
Blog Post

Counteract

Risk Analysis Platform for Social Media Threats

(2017) - I jumped onto this project after a friend of mine won a hackathon with a basic prototype. Counteract helps detect people on Twitter who may be at risk of harming themselves or others. If a tweet is found to have negative intent, Counteract generates a happy, encouraging response tweet.


To do this, I used a Tweet stream tool to constantly look for terms such as "suicide" or "kill" and other harshly negative terms. I ran each matching tweet through a hierarchical attention network to test whether or not it showed any harmful intent. If it did, a Markov chain generated a response tweet to the original user.
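The keyword filter and the Markov chain reply step can be sketched as follows. This is a minimal illustration, not the Counteract code: the function names, the two-sentence corpus, and the word-level first-order chain are assumptions, and the hierarchical attention network that sits between the two steps is left out.

```python
import random
from collections import defaultdict

# The two terms named above; a real watch list would be longer.
WATCH_TERMS = {"suicide", "kill"}


def matches_watch_terms(tweet: str) -> bool:
    """Return True if any whole word in the tweet is a watch term."""
    words = {word.strip(".,!?\"'").lower() for word in tweet.split()}
    return not WATCH_TERMS.isdisjoint(words)


def build_chain(sentences):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=12, rng=None):
    """Walk the chain from `start` until a dead end or max_words is reached."""
    rng = rng or random.Random()
    words = [start]
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


# Made-up stand-in for an encouraging-quotes corpus.
corpus = ["you are stronger than you think", "you are not alone in this"]
chain = build_chain(corpus)
reply = generate(chain, "you", rng=random.Random(0))
print(reply)
```

Note that a keyword match alone is very noisy ("kill time" matches too), which is why a classifier is needed before any reply is sent.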

Read more:
GitHub

View my GitHub

Awards

Undergraduate Finalist

Top 20 in undergraduate research out of ~350 projects for development of ColoBoost

Unite for Humanity Grand Award

For development of Counteract

Excellence in Computer Science

For research into retinopathy screening.

National Finalist

For development of Counteract.

Silver Medal

For research into retinopathy screening.

View full resume

Get In Touch


Shoot me an email at andy@andykamath.com or use the form below!