List of Datasets
All of these datasets are for research and analysis.
Twitter
- Top 400 Twitch Accounts With Twitter Handles (twitch,twitter,type,domain)
- Top 1000 Celebrity Accounts (twitter,domain,name,type)
- Top 1000 Sports Accounts (twitter,domain,name,type)
- Top 600 Brand Accounts (twitter,domain,name,type)
- Top 1000 Company Accounts (domain,name,keywords,description,twitter)
- Small Dataset to classify Band or Organization from Twitter descriptions.
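The column lists in parentheses suggest comma-separated files. As a minimal sketch of reading one of them, assuming CSV files with no header row and the column order shown for the celebrity, sports, and brand datasets, something like the following could work. The function name `read_accounts` and the sample rows are hypothetical placeholders, not part of any dataset.

```python
import csv
import io

# Column order taken from the list above. Whether the real files carry a
# header row is an assumption (here: no header row).
ACCOUNT_COLUMNS = ["twitter", "domain", "name", "type"]


def read_accounts(handle, columns=ACCOUNT_COLUMNS):
    """Return one dict per non-empty row, with whitespace stripped from each field."""
    rows = []
    for fields in csv.reader(handle):
        if not fields:
            continue  # skip blank lines
        rows.append(dict(zip(columns, (field.strip() for field in fields))))
    return rows


# Made-up placeholder rows, not taken from any of the datasets.
sample = io.StringIO(
    "example_handle,example.com,Example Name,celebrity\n"
    "another_handle,example.org,Another Name,celebrity\n"
)
accounts = read_accounts(sample)
print(accounts[0]["twitter"], len(accounts))
```

For the Twitch and company datasets, pass a different `columns` list matching the order given for that dataset. If the files do have a header row, `csv.DictReader` without explicit field names would be the simpler choice.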
Names
- Dataset of 2,400 black (African American) female names for NLP training and analysis. The names have been retrieved from US public inmate records. (last name,first name,gender,race)
- Dataset of 50,000 black (African American) male names for NLP training and analysis. The names have been retrieved from US public inmate records. (last name,first name,gender,race)
- Dataset of ~40,000 white (Caucasian) male names for NLP training and analysis. The names have been retrieved from US public inmate records. (last name,first name,gender,race)
- Dataset of ~4,500 white (Caucasian) female names for NLP training and analysis. The names have been retrieved from US public inmate records. (last name,first name,gender,race)
- Dataset of ~4,000 Hispanic male names for NLP training and analysis. The names have been retrieved from US public inmate records. (last name,first name,gender,race)
- Dataset of ~200 Hispanic female names for NLP training and analysis. The names have been retrieved from US public inmate records. (last name,first name,gender,race)
- Dataset of ~14,000 Indian male names for NLP training and analysis. The names have been retrieved from public records. (name,gender,race)
- Dataset of ~14,000 Indian female names for NLP training and analysis. The names have been retrieved from public records. (name,gender,race)
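The name datasets use two different column layouts: four columns for the inmate-record files and three for the Indian name files. A minimal sketch of combining both layouts and counting records per race and gender might look like this. It assumes headerless CSV files; the identifiers `read_names`, `INMATE_COLUMNS`, and `INDIAN_COLUMNS` are hypothetical, and the sample rows (including how gender and race values are spelled) are made-up placeholders.

```python
import csv
import io
from collections import Counter

# Column layouts as listed above. The underscore spellings and the absence
# of a header row are assumptions made for this sketch.
INMATE_COLUMNS = ["last_name", "first_name", "gender", "race"]
INDIAN_COLUMNS = ["name", "gender", "race"]


def read_names(handle, columns):
    """Return one dict per non-empty row, keyed by the given column names."""
    records = []
    for fields in csv.reader(handle):
        if not fields:
            continue  # skip blank lines
        records.append(dict(zip(columns, (field.strip() for field in fields))))
    return records


# Made-up placeholder rows; the real value encodings may differ.
four_column_file = io.StringIO(
    "Lastname,Firstname,female,white\n"
    "Surname,Given,male,white\n"
)
three_column_file = io.StringIO("Fullname,female,indian\n")

records = read_names(four_column_file, INMATE_COLUMNS) + read_names(
    three_column_file, INDIAN_COLUMNS
)

# Both layouts share the gender and race columns, so they can be tallied together.
counts = Counter((record["race"], record["gender"]) for record in records)
print(counts)
```

Because the group sizes listed above differ widely (from ~200 to 50,000 names), a tally like this is a useful first check before using the files together for training.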