Aleph Week #2
Bayesian statistics; Bagging and Boosting in ML; the Geometry of Social Distancing; 2020 in Math and CS; Multimodal AI; ML theory with bad drawings; and more
From the web:
>> The Year in Math and Computer Science. Even as mathematicians and computer scientists proved big results in computational complexity, number theory and geometry, computers proved themselves increasingly indispensable in mathematics.
>> The immense potential and challenges of multimodal AI. I keep coming back to this because it is central to my personal view of AI: the real power will come from combining smaller subsystems into a complex macrosystem. This article points in that direction.
>> AIs that read sentences are now catching virus mutations. NLP algorithms designed for words and sentences can also be used to interpret genetic changes in viruses—speeding up lab work to spot new variants.
>> Someone took the time to make a game based on the year 2020.
>> The Math of Social Distancing is a Lesson in Geometry. How to safely reopen offices, schools and other public spaces while keeping people six feet apart comes down to a question mathematicians have been studying for centuries.
>> The Architect of Modern Algorithms. Barbara Liskov pioneered the modern approach to writing code. She warns that the challenges facing computer science today can’t be overcome with good design alone.
>> ML Theory with bad drawings.
>> Julia, a Python Challenger? The rapid adoption of Julia, the open source, high-level programming language with roots at MIT, shows no sign of slowing, according to data from Julialang.org. In 2020, the number of downloads jumped 87 percent to more than 24 million (2020 v. 2019) and the number of available packages rose 73 percent to roughly 4800.
>> How to get promoted. By Slava.
>> Use of Clearview AI facial recognition tech spiked as law enforcement seeks to identify Capitol mob. The company’s CEO said use of its tech was up 26 percent the day after the January 6th attack.
Seen on GitHub:
>> Best of ML with Python. “This curated list contains 820 awesome open-source projects with a total of 2.6M stars grouped into 32 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers.”
From reddit:
>> Datasets should behave like Git repositories.
>> What is the most interesting public dataset you know of?
Learn something:
>> Bagging and Boosting basics. How to combine weaker learners to build a stronger system.
>> Best Practices for ML Engineering. This document is intended to help those with a basic knowledge of machine learning get the benefit of Google's best practices in machine learning. It presents a style for machine learning, similar to the Google C++ Style Guide and other popular guides to practical programming.
>> Intro to Bayesian Statistics. A very nice YouTube playlist that “provides a complete introduction to the field of Bayesian statistics. It assumes very little prior knowledge and, in particular, aims to provide explanations of concepts with as little math as possible”.
>> Intro tutorial to Customer Lifetime Value calculation. Uses the LifeTime library in Python.
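To make the "combine weaker learners" idea from the Bagging and Boosting item concrete, here is a minimal sketch of bagging only (not boosting), in plain Python. It is not taken from the linked article. The toy data, the one-threshold "stump" learner and the names `train_stump` and `bagging` are all assumptions made for illustration.

```python
import random
from collections import Counter


def train_stump(points):
    """Fit a weak learner: a single threshold on x, chosen to minimise errors.

    `points` is a list of (x, label) pairs with labels 0 or 1.
    """
    best = None
    for threshold, _ in points:
        for low_label, high_label in ((0, 1), (1, 0)):
            errors = sum(
                (low_label if x < threshold else high_label) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, low_label, high_label)
    _, threshold, low_label, high_label = best
    return lambda x: low_label if x < threshold else high_label


def bagging(points, n_models=25, seed=0):
    """Train each stump on a bootstrap sample and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) examples with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(train_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: label is 1 when x >= 10, plus one mislabeled point.
data = [(x, int(x >= 10)) for x in range(20)]
data[3] = (3, 1)  # noisy label

ensemble = bagging(data)
print([ensemble(x) for x in (0, 5, 15, 19)])
```

Each stump sees a slightly different resampling of the data, so the noisy point sways only some of them, and the vote smooths it out. Boosting differs in that learners are trained in sequence, each one focusing on the examples the previous ones got wrong.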
Tools and repos:
>> LIFETIME. An interesting Python library for analyzing the lifetime of agents (customers?). In particular, it can be used to calculate Customer Lifetime Value, churn and other relevant parameters.
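For a feel of what a Customer Lifetime Value figure means, here is a back-of-the-envelope sketch in plain Python. It does not use the library above or its API. It assumes a constant churn rate per period, so the expected customer lifetime is 1 / churn rate. The function name `simple_clv` and all the numbers are hypothetical.

```python
def simple_clv(avg_order_value, orders_per_period, margin, churn_rate):
    """Rough CLV estimate: profit per period times expected lifetime in periods.

    Assumes a constant churn rate per period, so the expected lifetime
    is 1 / churn_rate periods. This is a simplification, not a fitted model.
    """
    if not 0 < churn_rate <= 1:
        raise ValueError("churn_rate must be in (0, 1]")
    expected_lifetime = 1 / churn_rate
    profit_per_period = avg_order_value * orders_per_period * margin
    return profit_per_period * expected_lifetime


# Hypothetical customer: 50 per order, 2 orders per period,
# 30% margin, 25% chance of churning each period.
value = simple_clv(avg_order_value=50.0, orders_per_period=2,
                   margin=0.3, churn_rate=0.25)
print(f"Estimated CLV: {value:.2f}")
```

Dedicated libraries go further by fitting probabilistic models to each customer's purchase history instead of assuming one fixed churn rate for everyone.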
You can contact me on Twitter and LinkedIn, or email me at gualterio at gmail.