Delivered once every Week. No Spam Ever.

Issue - 172

Worthy Read

Docker and Kubernetes provide the platform for organizations to get software to market quickly. In this webinar, you will get a practical guide to designing a Docker-based CD pipeline on Kubernetes with GoCD.

Join the TensorFlow team as they kick off the 2018 TensorFlow Dev Summit! The TensorFlow Dev Summit brings together a diverse mix of machine learning users from around the world for a full day of highly technical talks, demos, and conversations with the TensorFlow team and community.

I’ve used Python’s textblob classifier to classify issues by assignee, based on their descriptions and headers. Already-classified issues are used to classify newly created issues, and the results are recorded to a database. 2019 issues were used as the training set, and 82% assignment accuracy was achieved. As the training set grows, accuracy could improve.
text classification

interview questions

A survey of 9,500 developers shows what Python programmers use and what they work on. See how typical you are as a Python developer.

Nice curated list.

I frequently predict proportions (e.g., the proportion of a year during which a customer is active). This is a regression task because the dependent variable is a float, but the dependent variable is bounded between 0 and 1. Googling around, I had a hard time finding a good way to model this situation, so I’ve written here what I think is the most straightforward solution.
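
One common way to handle a target bounded between 0 and 1 (not necessarily the solution the linked post settles on) is to regress on the logit of the proportion and map predictions back through the sigmoid, so they can never leave the (0, 1) interval. A minimal sketch with made-up data; `fit_line`, `predict_proportion`, and the sample values are hypothetical:

```python
import math

def logit(p, eps=1e-6):
    # Clip so that an exact 0 or 1 does not produce infinities.
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_line(xs, ys):
    # Ordinary least squares for a single feature: y = intercept + slope * x.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: months since signup vs. proportion of the year active.
months = [1, 3, 6, 9, 12, 18, 24]
active = [0.05, 0.15, 0.35, 0.55, 0.70, 0.90, 0.97]

# Fit the line in logit space, then transform predictions back.
intercept, slope = fit_line(months, [logit(p) for p in active])

def predict_proportion(x):
    return sigmoid(intercept + slope * x)

predictions = [predict_proportion(x) for x in (0, 12, 60)]
print(predictions)
```

Even for an input far outside the training range (60 months here), the prediction stays strictly between 0 and 1, which a plain linear regression on the raw proportion would not guarantee.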

Clustering data is the process of grouping items so that items in a group (cluster) are similar and items in different groups are dissimilar. After data has been clustered, the results can be analyzed to see if any useful patterns emerge. For example, clustered sales data could reveal which items are often purchased together (famously, beer and diapers).
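
As a rough illustration of the idea, here is a minimal k-means sketch. K-means is only one of many clustering algorithms, and the linked article may use a different one; the `kmeans` helper, the sample points, and the starting centroids are all hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    # Plain k-means: alternate between assigning points to the nearest
    # centroid and moving each centroid to the mean of its cluster.
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [math.dist(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old  # keep the old centroid if a cluster is empty
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
          (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]

centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Starting centroids are passed in explicitly so the run is deterministic; real implementations usually pick them randomly or with a seeding heuristic.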

In this article, we will go through the evaluation of topic modeling by introducing the concept of topic coherence, as topic models give no guarantee on the interpretability of their output. Topic modeling provides us with methods to organize, understand and summarize large collections of textual information. There are many techniques that are used to obtain topic models. Latent Dirichlet Allocation (LDA) is a widely used topic modeling technique for extracting topics from textual data.
topic modeling

PyTorch 1.0 takes the modular, production-oriented capabilities from Caffe2 and ONNX and combines them with PyTorch's existing flexible, research-focused design to provide a fast, seamless path from research prototyping to production deployment for a broad range of AI projects.

Since I moved to Amsterdam, I’ve been biking to work almost every morning. And as Google is always tracking the location of my phone, I thought that it might be interesting to do something with that data.

Recently I did some PostgreSQL consulting in the Berlin area (Germany) when I stumbled over an interesting request: How can data be shared across function calls in PostgreSQL? I recalled one of the other features of PostgreSQL (15+ years old or so) to solve the issue. Here is how it works.

So you have a dataset and you’re about to run some tests on it, but first you need to check for normality. Think about this question: “Given my data … if there is a deviation from normality, will there be a material impact on my results?”
data science

Ever just wanted to download a bunch of subtitles to check which one fits the video? Subscene has everything, but it can be tedious to download subtitles one by one.

Given the risk and return of 4 assets as follows, what could the risk-return be for any portfolio built from those assets? One may think that all possible values should fall inside the area bounded by the four assets. But it is possible to go beyond that bound, because combining inversely correlated assets can produce a portfolio with lower risk.
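
The risk-reduction effect can be checked with the standard two-asset portfolio variance formula. The sketch below uses two assets rather than four to keep it short; the volatilities, correlation, and weights are made-up numbers, and `portfolio_risk` is a hypothetical helper, not code from the linked post.

```python
import math

def portfolio_risk(w1, sigma1, sigma2, rho):
    # Standard deviation of a two-asset portfolio with weights w1 and 1 - w1:
    # var = (w1*s1)^2 + (w2*s2)^2 + 2*w1*w2*rho*s1*s2
    w2 = 1 - w1
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2 * w1 * w2 * rho * sigma1 * sigma2
    )
    return math.sqrt(variance)

# Hypothetical assets: 10% and 20% volatility, correlation of -0.5.
sigma_a, sigma_b, rho = 0.10, 0.20, -0.5

mixed = portfolio_risk(0.7, sigma_a, sigma_b, rho)
print(f"70/30 portfolio risk: {mixed:.4f}")
print(f"lowest single-asset risk: {min(sigma_a, sigma_b):.4f}")
```

With these inputs the 70/30 mix is less risky than either asset held alone, which is why the reachable risk-return region extends beyond the individual assets.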

A basic graph representation function built on top of the networkx graph library.

My goal in this post is simply to share how we at YouVersion are leveraging machine learning tools to generate product recommendations.


mass_archive - 68 Stars, 11 Fork
A basic tool for pushing a web page to multiple archiving services at once.

sublime_black - 39 Stars, 0 Fork
Sublime Text package to format Python code using the black formatter.

pinboard-backup - 5 Stars, 0 Fork
This backs up Pinboard bookmarks to DynamoDB.

pipenv-pipes - 5 Stars, 1 Fork
Pipes - PipEnv Environment Switcher

cognises - 4 Stars, 1 Fork
Flask Cognises: AWS Cognito group based authentication with user management

gsync - 3 Stars, 0 Fork
Simple PyDrive wrapper and command line tool.

Chinese_models_for_SpaCy - 3 Stars, 0 Fork
Models for SpaCy that support Chinese

pyjson5 - 3 Stars, 0 Fork
A JSON5 serializer and parser library for Python 3 written in Cython.

Palette_Bot - 3 Stars, 0 Fork
A Reddit bot that generates a color palette for images it is called upon

countryinfo - 3 Stars, 3 Fork
A Python module for returning data about countries, ISO info and states/provinces within them.

desktop-entry-creator - 2 Stars, 1 Fork
A user-friendly GUI for creating desktop entries for installed applications on Linux