Automated Democracy Scores. Brazilian Review of Econometrics, 37(1), 31-43, 2017. In this paper I use natural language processing to create the first machine-coded democracy index, which I call Automated Democracy Scores (ADS). I base the ADS on 42 million news articles from 6,043 different sources. The ADS cover all independent countries in the 1993-2012 period. Unlike the democracy indices we have today, the ADS are replicable and have standard errors small enough to actually distinguish between cases. (I also wrote a related paper where I try a bunch of other methods: LSA, LDA, Random Forest.) Data and code. I created a web app that lets anyone tweak the training data and see how the results change, without having to write any code. If you do want to see the gory details, here’s what you need to know.
Deep Learning Anomaly Detection as Support Fraud Investigation in Brazilian Exports and Anti-Money Laundering (with Ebberth Paula, Marcelo Ladeira, and Rommel Carvalho). 15th IEEE International Conference on Machine Learning and Applications (ICMLA), 2016. Here we use deep learning to detect fake Brazilian exports. Data and code. Sorry, it’s company-level data and therefore protected by Brazilian privacy laws (I only had access to it because a co-author works at Brazil’s tax authority).
A dimensão geográfica das eleições brasileiras (“The spatial dimension of Brazilian elections”). Opinião Pública (Public Opinion), 19(2), 270-290, 2013. Here I use spatial econometrics and the Brazilian election of 2010 to understand why neighboring counties tend to vote similarly. Data and code. I used a mix of Stata (here) and R (here) code. The dataset is here (it’s in Stata format; convert it to CSV format to run the R code). The list of missing observations is here. (To produce the plots I used GeoDa and ArcGIS, using the respective GUIs, so there’s no code for those.)
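To make "neighboring counties tend to vote similarly" concrete, here is a minimal sketch of Moran's I, a standard measure of spatial autocorrelation. It is an illustration only and not necessarily the statistic or specification used in the paper; the six-county map, the vote shares, and the function name `morans_i` are all made up for the example.

```python
def morans_i(values, neighbors):
    """Moran's I with binary weights: w_ij = 1 if j is a neighbor of i, else 0."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations, summed over every (i, j) neighbor pair.
    cross = sum(dev[i] * dev[j] for i, js in neighbors.items() for j in js)
    total_weight = sum(len(js) for js in neighbors.values())
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


# Hypothetical toy map: six counties in a row, each bordering the next.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

# Made-up vote shares: similar neighbors vs. alternating neighbors.
clustered = [0.20, 0.25, 0.30, 0.70, 0.75, 0.80]
alternating = [0.20, 0.80, 0.20, 0.80, 0.20, 0.80]

print(morans_i(clustered, chain))    # positive: neighbors vote alike
print(morans_i(alternating, chain))  # negative: neighbors vote differently
```

A positive value says that neighbors resemble each other more than chance would suggest; it does not by itself say why, which is the question the spatial models address.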
Lobby e protecionismo no Brasil contemporâneo (“Lobby and protectionism in Brazil”). Revista Brasileira de Economia (Brazilian Review of Economics), 62(3), 263-278, 2008. Here I regress tariffs on industry-level indicators of political power (economic concentration, number of workers, etc.). Data and code. I ran everything almost a decade ago and back then I used Excel spreadsheets to store data (I know, I know…) and I clicked buttons instead of writing code (I didn’t know any better), so I don’t have much to offer here. The spreadsheets are all in this zipped folder.
Using SVM to pre-classify government expenditures (2015). Here I use support vector machines (SVM) to create an app that could reduce misclassification of government purchases in Brazil. The app suggests likely categories based on the description of the good being purchased. Data and code. Download and decompress the CSV files and save them all in the same folder. Then use the scripts parseX.py and parseY.py to create X.pkl and Y.pkl, respectively. (I know, I could simply let you download X.pkl and Y.pkl directly, but you should not trust Python pickles you didn’t create yourself. And the pickles take up a lot more space than the CSVs.) Then use the catmat_svm.py script to train and validate the classifier. As for the web app, it’s open source.
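In case the warning about pickles sounds paranoid, here is a minimal sketch of why it isn't. Loading a pickle can call any function the file's author chose. The `Payload` class is hypothetical and the function it triggers (`sorted`) is deliberately harmless; a malicious file could point at something like `os.system` instead.

```python
import pickle


class Payload:
    """Hypothetical stand-in for an object hidden in someone else's pickle."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it at load time.
        # A harmless built-in is used here purely for illustration.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it runs sorted([3, 1, 2]) instead.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Building X.pkl and Y.pkl yourself from the CSVs avoids the problem, because you only ever load files your own scripts wrote.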
Ideological bias in democracy measures (2012). Here I use Monte Carlo simulations to reassess some studies on the biases behind the Freedom House, Polity IV, and other democracy measures. I find that the evidence of bias is robust but that we can’t know which measures are biased or in what direction (e.g., for all we know the Freedom House may well have a leftist bias, contrary to popular belief). Data and code. I used a mix of Stata (here and here) and R (here) code. Here’s the data in Stata format and here’s the same data in CSV format (for the R code).
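Here is a toy Monte Carlo that illustrates the intuition behind the "we can’t know which measure is biased" point. It is not the simulation design from the paper: the two measures `a` and `b`, the bias sizes, the noise levels, and the function name `mean_gap` are all assumptions made up for this sketch.

```python
import random
import statistics


def mean_gap(bias_a, bias_b, n_countries=200, n_sims=500, seed=0):
    """Average right-minus-left gap in (measure A - measure B) across simulations."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_sims):
        right, left = [], []
        for _ in range(n_countries):
            democracy = rng.gauss(0, 1)      # latent "true" level, never observed
            ideology = rng.choice((-1, 1))   # -1 = left government, 1 = right
            a = democracy + bias_a * ideology + rng.gauss(0, 0.5)
            b = democracy + bias_b * ideology + rng.gauss(0, 0.5)
            (right if ideology == 1 else left).append(a - b)
        gaps.append(statistics.mean(right) - statistics.mean(left))
    return statistics.mean(gaps)


# Scenario 1: measure A favors the right, measure B is unbiased.
gap_a_rightist = mean_gap(bias_a=0.5, bias_b=0.0)
# Scenario 2: measure A is unbiased, measure B favors the left.
gap_b_leftist = mean_gap(bias_a=0.0, bias_b=-0.5)

print(gap_a_rightist, gap_b_leftist)
```

Both scenarios produce essentially the same gap between the two measures, so comparing the measures with each other can reveal that some bias exists but not which measure carries it or in which direction.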
Why is democracy declining in Latin America? (2011). Here I argue that Latin America’s “left turn” in the 2000s was accompanied by democratic erosion, as the new governments that came to power relied on constituencies that did not value democracy (which in turn reduced the electoral cost of suppressing press freedom, violating term limits, etc.). Data and code. Here’s the Stata do file and here’s the dta file.
O terceiro fracasso do Mercosul (“The third failure of Mercosur”). O Estado de São Paulo, 2/5/2011. Here I discuss why Mercosur failed to lock in the trade liberalization of the 1990s.
O preço de aceitar a Venezuela (“The price of accepting Venezuela”). O Estado de São Paulo, 5/28/2009. Here I discuss the policy consequences of Venezuela’s entry into Mercosur (a trade bloc comprising Brazil, Argentina, Paraguay, Uruguay, and Venezuela).