machine learning podcast
04 Jan 2015

Here, by John Myles White.
tl;dr:
The primary problem with statistical computing in Julia is that the current tools were all designed to emulate R. Unfortunately, R’s approach to statistical computing isn’t amenable to the kinds of static analysis techniques that Julia uses to produce efficient machine code.
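White's claim is concrete: Julia's speed comes from type inference, and R-style data, where any cell may be NA, defeats it. A minimal sketch of the effect, assuming modern Julia and using `Union{Float64,Missing}` as a stand-in for R's NA (the example and names are mine, not from the post):

```julia
# One generic loop; Julia compiles a specialized version per element type.
function total(xs)
    s = 0.0
    for x in xs
        s += x
    end
    return s
end

dense = randn(10_000)                          # Vector{Float64}
rlike = Vector{Union{Float64,Missing}}(dense)  # R-style: any cell may be NA

total(dense)  # fully inferred: `s` is always Float64, compiles to a tight loop
total(rlike)  # `0.0 + missing` is possible, so inference widens `s` to
              # Union{Float64,Missing} and each iteration checks a type tag
```

Running `@code_warntype total(rlike)` shows the union-typed accumulator that blocks the static analysis White has in mind.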
And here, by Dan Luu.
tl;dr:
It’s not unusual to run into bugs when using a young language, but Julia has more than its share of bugs for something at its level of maturity. If you look at the test process, that’s basically inevitable. […] Not only are existing tests not very good, most things aren’t tested at all.
The choice of an efficient document preparation system is an important decision for any academic researcher. To assist the research community, we report a software usability study in which 40 researchers across different disciplines prepared scholarly texts with either Microsoft Word or LaTeX. The probe texts included simple continuous text, text with tables and subheadings, and complex text with several mathematical equations. We show that LaTeX users were slower than Word users, wrote less text in the same amount of time, and produced more typesetting, orthographical, grammatical, and formatting errors. On most measures, expert LaTeX users performed even worse than novice Word users. LaTeX users, however, more often report enjoying using their respective software. We conclude that even experienced LaTeX users may suffer a loss in productivity when LaTeX is used, relative to other document preparation systems. Individuals, institutions, and journals should carefully consider the ramifications of this finding when choosing document preparation strategies, or requiring them of authors.
The discovery of causal relationships from purely observational data is a fundamental problem in science. The most elementary form of such a causal discovery problem is to decide whether X causes Y or, alternatively, Y causes X, given joint observations of two variables X, Y. This was often considered to be impossible. Nevertheless, several approaches for addressing this bivariate causal discovery problem were proposed recently. In this paper, we present the benchmark data set CauseEffectPairs that consists of 88 different “cause-effect pairs” selected from 31 datasets from various domains. We evaluated the performance of several bivariate causal discovery methods on these real-world benchmark data and on artificially simulated data. Our empirical results provide evidence that additive-noise methods are indeed able to distinguish cause from effect using only purely observational data. In addition, we prove consistency of the additive-noise method proposed by Hoyer et al. (2009).
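The additive-noise idea in the abstract is simple enough to sketch: fit a nonlinear regression in each direction, and prefer the direction whose residuals look independent of the input. Below is a toy Julia sketch under my own simplifications (Nadaraya-Watson regression, a biased plug-in HSIC statistic, fixed bandwidths), not the paper's implementation; all function names are mine:

```julia
using LinearAlgebra, Random, Statistics

zscore(v) = (v .- mean(v)) ./ std(v)

# Nadaraya-Watson kernel regression of y on x, evaluated at the sample points.
function kernel_fit(x, y; h = 0.5)
    map(x) do xi
        w = exp.(-((x .- xi) .^ 2) ./ (2h^2))
        sum(w .* y) / sum(w)
    end
end

# Biased plug-in HSIC statistic with Gaussian kernels: tr(KHLH) / n^2.
# Near zero when u and v are independent.
function hsic(u, v; σ = 1.0)
    n = length(u)
    K = [exp(-(a - b)^2 / (2σ^2)) for a in u, b in u]
    L = [exp(-(a - b)^2 / (2σ^2)) for a in v, b in v]
    H = I - fill(1.0 / n, n, n)   # centering matrix
    sum((H * K * H) .* L) / n^2
end

# Score the direction x -> y: fit y ≈ f(x) and measure how dependent
# the residuals remain on x (lower = more plausible causal direction).
function anm_score(x, y)
    x, y = zscore(x), zscore(y)
    hsic(x, y .- kernel_fit(x, y))
end

Random.seed!(1)
x = randn(500)
y = x .^ 3 .+ 0.3 .* randn(500)    # ground truth: X causes Y

sxy, syx = anm_score(x, y), anm_score(y, x)
println(sxy < syx ? "inferred: X -> Y" : "inferred: Y -> X")
```

On real pairs like those in CauseEffectPairs you would swap the plug-in statistic for a proper HSIC independence test with bandwidths chosen by the median heuristic, which is the usual choice in the additive-noise literature.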