Table of Contents
Note that this material will eventually all be moved to https://www.violet-mica.com/wiki/index.php?title=Main_Page
Public datasets
- Sherlock Holmes texts
- Enron emails
- https://www.figure-eight.com/data-for-everyone/ (Figure Eight, formerly known as “CrowdFlower”)
- https://www.kaggle.com/datasets (Kaggle is a data science and machine learning community and competition platform, acquired by Google in March 2017)
Security
Organizations Surrounding AI
- OpenAI (https://openai.com/): “Discovering and enacting the path to safe artificial general intelligence.”
- Council on Extended Intelligence (https://globalcxi.org/): “CXI was created to proliferate the ideals of responsible participant design, data agency and metrics of economic prosperity prioritizing people and the planet over profit and productivity.” (https://www.wired.com/story/a-plea-for-ai-that-serves-humanity-instead-of-replacing-it/)
Learning Materials
- Free online class for the book An Introduction to Statistical Learning, taught by two of the book's authors (Trevor Hastie and Robert Tibshirani), who also co-wrote The Elements of Statistical Learning.
- Google Free Machine Learning Crash Course with TensorFlow APIs https://developers.google.com/machine-learning/crash-course/
- DataQuest for learning Data Science, Data Analysis, or Data Engineering (https://www.dataquest.io/home)
Everything past this point is only somewhat organized and could use re-organization.
Most of it is machine learning and deep learning links, articles, and resources.
Data Skeptic Podcast https://dataskeptic.com (all things skepticism, statistics, probability, machine learning, and AI)
Data Visualization
https://www.reddit.com/r/dataisbeautiful/ “for visualizations that effectively convey information.”
https://www.reddit.com/r/dataisugly/ “all about butchered visualizations, misleading charts and unlabelled axes.”
Newsletters
http://transmission.ai/
http://digest.deeplearningweekly.com/
Fundamentals
Implementing machine learning algorithms from scratch https://machinelearningmastery.com/start-here/
Sample walk-through on implementing a deep reinforcement learning model from reading a journal article: http://amid.fish/reproducing-deep-rl
This is how I really began to understand neural networks: Hacker’s Guide to Neural Networks by Andrej Karpathy (manually implement a basic neural network to understand what’s under the hood of frameworks like TensorFlow and Torch)
word2vec skip-gram model http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
How neural networks learn nonlinear functions and classify linearly non-separable data by Vivek Yadav
Yes you should understand backprop by Andrej Karpathy
AI heuristics: https://heuristicswiki.wikispaces.com/
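In the spirit of the "implement it by hand" resources above, here is a minimal sketch of a tiny neural network with manually written backpropagation, trained on XOR (a classic linearly non-separable problem). It is not taken from any of the linked articles. The layer sizes, tanh/sigmoid activations, squared-error loss, learning rate, and all function names are my own assumptions, chosen only to keep the example short and dependency-free.

```python
import math
import random

# XOR: the four input pairs and their targets (not linearly separable).
XOR_DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def init_params(hidden, rng):
    """Random weights for a 2 -> hidden -> 1 network (sizes are an assumption)."""
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "W2": [rng.uniform(-1, 1) for _ in range(hidden)],
        "b2": 0.0,
    }


def forward(params, x):
    """Return (hidden activations, output) for one input pair."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(params["W1"], params["b1"])]
    z = sum(w * hi for w, hi in zip(params["W2"], h)) + params["b2"]
    y = 1.0 / (1.0 + math.exp(-z))  # sigmoid output
    return h, y


def loss(params, data):
    """Mean of 0.5 * (prediction - target)^2 over the data set."""
    return sum(0.5 * (forward(params, x)[1] - t) ** 2 for x, t in data) / len(data)


def gradients(params, data):
    """Backpropagation by hand: gradient of loss() w.r.t. every parameter."""
    hidden = len(params["W2"])
    g = {"W1": [[0.0, 0.0] for _ in range(hidden)], "b1": [0.0] * hidden,
         "W2": [0.0] * hidden, "b2": 0.0}
    n = len(data)
    for x, t in data:
        h, y = forward(params, x)
        dz = (y - t) * y * (1.0 - y) / n  # chain rule through the sigmoid
        g["b2"] += dz
        for j in range(hidden):
            g["W2"][j] += dz * h[j]
            da = dz * params["W2"][j] * (1.0 - h[j] ** 2)  # through tanh
            g["b1"][j] += da
            for i in range(2):
                g["W1"][j][i] += da * x[i]
    return g


def train(params, data, lr, epochs):
    """Plain full-batch gradient descent, updating params in place."""
    for _ in range(epochs):
        g = gradients(params, data)
        params["b2"] -= lr * g["b2"]
        for j in range(len(params["W2"])):
            params["W2"][j] -= lr * g["W2"][j]
            params["b1"][j] -= lr * g["b1"][j]
            for i in range(2):
                params["W1"][j][i] -= lr * g["W1"][j][i]
    return params


if __name__ == "__main__":
    params = init_params(hidden=3, rng=random.Random(0))
    print("loss before training:", loss(params, XOR_DATA))
    train(params, XOR_DATA, lr=0.5, epochs=5000)
    print("loss after training: ", loss(params, XOR_DATA))
    for x, t in XOR_DATA:
        print(x, "target", t, "prediction", round(forward(params, x)[1], 3))
```

A useful sanity check when writing backprop by hand is to compare each analytic gradient against a finite-difference estimate of the loss. Whether this particular run fits XOR well depends on the random initialization, so try a different seed or more hidden units if it gets stuck.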
Free Learning Resources
http://neuralnetworksanddeeplearning.com/ by Michael Nielsen
CS231n Convolutional Neural Networks for Visual Recognition by Andrej Karpathy et al.
http://www.deeplearningbook.org/ by Ian Goodfellow, Yoshua Bengio, Aaron Courville
UC Berkeley CS188 Intro to AI Course Materials, which uses the 3rd edition of AIMA (Artificial Intelligence: A Modern Approach) by Stuart Russell and Peter Norvig
Reddit
https://www.reddit.com/r/learnmachinelearning/
https://www.reddit.com/r/MLQuestions/
https://www.reddit.com/r/MachineLearning/
FAQ and Link-Collection of reddit’s /r/MachineLearning/, where they have a list of MOOCs, books, and tons of other resources.
https://www.reddit.com/r/datasets/
StackExchange CrossValidated (statistics)
Improving at AI/ML/Data Science
http://blog.kaggle.com/2018/05/07/profiling-top-kagglers-bestfitting-currently-1-in-the-world/
Blogs
http://www.cleverhans.io/ Ian Goodfellow and Nicolas Papernot
Paid Learning Resources
https://www.udacity.com/course/deep-learning-nanodegree-foundation--nd101
Pre-trained models
Caffe Model Zoo github and documentation
Where to find out what the current best-performing image recognition architecture is, according to Andrej Karpathy: ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
Demos
sense2vec using word2vec and spaCy by Andrew Trask et al., 2015
Browse Arxiv journal articles more pleasantly, courtesy Andrej Karpathy:
http://www.arxiv-sanity.com/
Note that it currently (2017-04-15) only displays the arXiv sections matching Andrej Karpathy’s research interests, specifically AI/ML and related, but you can fork the GitHub project and maybe configure it for other sections.
Sonnet (GitHub repository) is a higher-level framework built around TensorFlow, released by DeepMind and introduced in their blog post. “We’ve found that writing code which explicitly represents submodules allows easy code reuse and quick experimentation – Sonnet promotes writing modules which declare other submodules internally, or are passed other modules at construction time.”
Distill is meant to replace PDF as a research publication format, offering interactivity instead of a static page
Here’s an article introducing and explaining Distill: https://www.hpcwire.com/2017/03/22/google-launches-new-machine-learning-journal/
Here’s an example of an article published in Distill: http://distill.pub/2016/augmented-rnns/
Here’s another
Deep Learning Conferences & Events aggregated by Deep Learning Weekly
Deep Learning Best Practices:
https://medium.com/intuitionmachine/infographic-best-practices-in-training-deep-learning-networks-b8a3df1db53
Link to Google search for [machine OR deep learning standards OR “best practices”]
Un-vetted but interesting-looking articles:
Unsupervised sentiment neuron – learn sentiment by being trained to predict the next character in the text of Amazon reviews
Deep Learning Part 3: Combining Deep Convolutional Neural Network with Recurrent Neural Network
Computer Science Overviews:
https://btholt.github.io/four-semesters-of-cs/
http://bigocheatsheet.com/
The data structures and algorithms book, Introduction to Algorithms: https://mitpress.mit.edu/books/introduction-algorithms
Data Science as a Service (“Data Science On-Demand.”): https://tresl.co/
How to see what a website is built with: builtwith.com
Building a computer: pcpartpicker.com
Standards
- Standards for structuring data on the web so that friendly bots can traverse it and help out humans: https://schema.org/ (“Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.”)
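To make the idea of structured data more concrete, here is a rough sketch of a Schema.org description of a dataset, serialized as JSON-LD and wrapped in the script tag a web page would embed. `Dataset`, `name`, `description`, and `url` are real Schema.org terms. The helper names and the example values (including the example.com URL) are hypothetical and only there for illustration.

```python
import json


def dataset_jsonld(name, description, url):
    """Build a minimal Schema.org Dataset description as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }


def as_script_tag(doc):
    """Wrap a JSON-LD document in the script tag used to embed it in HTML."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Hypothetical values; replace with a real data set's details.
    doc = dataset_jsonld(
        name="Example text corpus",
        description="A made-up corpus used only to illustrate the markup.",
        url="https://example.com/datasets/corpus",
    )
    print(as_script_tag(doc))
```

A crawler that understands Schema.org can read this block without having to guess at the page layout, which is the whole point of the standard.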
Using technology to make the world a better place
- Find a local neighbor who can take your compost, or let others know that you can take their compost, at ShareWaste (https://sharewaste.com/share-waste)
How to affordably access computational power for machine learning:
- How to do machine learning on AWS Spot Instances: (I found these by googling “machine learning spot instances”)
Potentially Scary uses of Machine Learning
Shopping
Amazon:
- Price watches and email notifications: camelcamelcamel.com
- Export/generate reports regarding orders/items: https://www.amazon.com/gp/b2b/reports
Extensive reviews for the best tools and home products in every category: https://thewirecutter.com and their sister site http://thesweethome.com/