Planet dgplug

August 03, 2020

Jason Braganza (Work)

A Hundred Days of Code, Day 026 - Refactoring

Worked only an hour today.
Trying to change the little lookup program I made the other day into something a little better.

Not quite a good day.
Calling it quits early.


by Mario Jason Braganza at August 03, 2020 07:30 AM

A Hundred Days of Code, Day 025 - Comprehension Exercises

Working on Comprehension exercises today.

Reflections and Experience


by Mario Jason Braganza at August 03, 2020 07:27 AM

August 01, 2020

Kushal Das

Use DoH over Tor for your Qubes system

I was using my dns-tor-proxy tool in the AppVMs of my QubesOS system, but at the same time I was trying to figure out how to make it the default DNS service for the whole Qubes system.

ahf provided me with a shell script showing how he forwards DNS requests to a VPN interface. I modified it so that all standard DNS queries become DoH queries over the Tor network.

Setting up sys-firewall

In the following example, I am setting up the sys-firewall service VM. All other AppVMs connected to this VM as their netvm will use dns-tor-proxy without any modification.

Make sure that the template for sys-firewall has the latest Tor installed. You can get it from the official Tor repository.

Download (or build) the latest dns-tor-proxy 0.3.0 release, and put the file (marked as executable) in the /rw/config/ directory.

Next, modify the /rw/config/rc.local file & add the following lines.

systemctl start tor
sh /rw/config/dns.sh
/rw/config/dns-tor-proxy --doh &

As you can see, we are executing another script at /rw/config/dns.sh, which has the following content. Remember to modify the DNS value to the right IP for your sys-firewall vm.


#!/bin/sh

QUBES_DNS_SERVERS="10.139.1.1 10.139.1.2"
DNS=10.137.0.x

# accept DNS requests from the other vms

iptables -I INPUT -i vif+ -p udp --dport 53 -j ACCEPT
iptables -I INPUT -i vif+ -p tcp --dport 53 -j ACCEPT

# Clean up our NAT firewall rules.
iptables --flush PR-QBS --table nat

# We take incoming traffic on TCP and UDP port 53 and forward to
# our DNS server.
for QUBES_DNS_SERVER in ${QUBES_DNS_SERVERS} ; do
    iptables --append PR-QBS --table nat --in-interface vif+ --protocol tcp --destination "${QUBES_DNS_SERVER}" --dport 53 --jump DNAT --to-destination "${DNS}":53
    iptables --append PR-QBS --table nat --in-interface vif+ --protocol udp --destination "${QUBES_DNS_SERVER}" --dport 53 --jump DNAT --to-destination "${DNS}":53
done

# Log *other* DNS service connections. This part is optional, but ensures that
# you can monitor if one of your VM's is making any traffic on port 53 with
# either TCP or UDP. If you want to log *every* DNS "connection", including the
# ones to QUBES_DNS_SERVERS, you can either move these commands up before the
# for-loop in this file or change the --append option to be an --insert instead.
iptables --append PR-QBS --table nat --in-interface vif+ --protocol tcp --dport 53 --jump LOG --log-level 1 --log-prefix 'DNS Query: '
iptables --append PR-QBS --table nat --in-interface vif+ --protocol udp --dport 53 --jump LOG --log-level 1 --log-prefix 'DNS Query: '

Now, restart your sys-firewall VM, and you are all set for your DNS queries.
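
Not part of the original setup, but as a quick sanity check from any AppVM that uses this sys-firewall as its netvm, a few lines of Python are enough to confirm that names still resolve; the DNAT rules above ensure the answers actually come from dns-tor-proxy as DoH over Tor. The hostnames below are arbitrary examples.

import socket

# Ordinary lookups from the AppVM go to the Qubes DNS addresses,
# get DNAT-ed to dns-tor-proxy, and are answered as DoH over Tor.
for host in ("check.torproject.org", "www.qubes-os.org"):
    try:
        addr = socket.getaddrinfo(host, 443)[0][4][0]
        print(f"{host} -> {addr}")
    except socket.gaierror as err:
        print(f"{host} failed: {err}")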

August 01, 2020 02:30 PM

July 20, 2020

Farhaan Bukhsh

Link Tray

TL;DR: Link Tray is a utility we recently wrote to curate links from different places and share them with your friends. The blog post has technical details and probably some productivity tips.

Link Bubble got my total attention when I first learnt about it. I felt it was a very novel idea: it saves time and helps you curate the websites you visit. On the whole, and believe me I am downplaying it, Link Bubble does two things:

  1. Saves time by pre-opening the pages
  2. Helps you to keep a track of pages you want to visit

It's a better tab management system; what felt weird to me was building a whole browser to do that. Obviously, I am being extremely naive in saying so, because I don't know what it takes to build a utility like that.

They discontinued it a while ago and I never got a chance to use it, so I thought I would try building something very similar, though my use case was totally different. Generally, when I go through blogs or articles, I open the links mentioned in them in separate tabs to come back to later. This has bitten me back many times, because I just get lost in so many links.

I thought that a utility which could just capture links on the fly, and let me quickly go through them by title, might make my job easier. I bounced the idea off Abhishek and we ended up prototyping LinkTray.

Our first design was heavily inspired by Facebook Messenger, but instead of chat heads we had open links. If you think about it, the idea feels very beautiful, but the design is emphatically not scalable. For example, with as many as 10 links open, we had trouble finding the link we were interested in, which was a beautiful design problem to run into.

We quickly went to the whiteboard and put up a list of requirements from first principles. The ask was simple:

  1. To share multiple links with multiple people with least transitions
  2. To be able to see what you are sharing
  3. To be able to curate links (add/remove/open links)

We took inspiration from an actual drawer: you flick it out to reveal a bunch of links and go through them. In a serendipitous moment the design came to us, and that is how Link Tray came to look the way it does now.

Link Tray

Link Tray was a technical challenge as well. There is a plethora of things I learnt about the Android ecosystem and application development that I knew existed but had never ventured into exploring.

Link Tray is written in Java, and I was using a very loosely maintained library to get the overlay activity to work. Yes, the floating activity or application that we see is called an overlay activity; it allows the application to be opened over an already running application.

The library I was using does not support Android O and above. It took me a few nights to figure that out 😞, partly because I was hacking on the project during the nights 😛. After reading a lot of GitHub issues I figured out the problem and put in support for the required operating system versions.

One of the really exciting Android features that I explored is Services. I think I might have read most of the blogs out there and all the documentation available, and I know that I still don't know enough, but I was able to pick up enough pointers to make my utility work.

Just like Uncle Bob says: make it work, then make it better. There was a persistent problem: the service needs to keep running in the background for the app to work. This was not a functional issue, but it was certainly a performance issue, and users of version 1.0 did have a problem with it. People got misled by the constant notification that LinkTray is running, and it was annoying. This looked like a simple problem on the surface but was a monster in the depths.

Architecture of Link Tray

The solution to the problem was simple: stop the service when the tray is closed, and start it again when a link is shared back to Link Tray. I tried that; the service did stop, but when a new link was shared the application kept crashing. Later I figured out that the bound service started by the library I am using sets a bound flag to True, but tries to reset the flag in the wrong place. This prompted me to write this StackOverflow answer to help people understand the lifecycle of a service. Finally, after a lot of logs and debugging sessions, I found the issue and fixed it. It was one of the most exciting moments and it helped me learn a lot of key concepts.

The other key learning I picked up while developing Link Tray was about multithreading. When a link is shared to Link Tray, we need the title of the page (if it has one) and the favicon of the website. Initially I was doing this on the main UI thread, which is not only an anti-pattern but also a usability hazard: it is a network call that blocks the application until it completes. I learnt how to make the network call on a different thread and keep the application smooth.

The initial approach was to get a WebView to work: we were literally opening the links in a browser and pulling the title and favicon out, which was a very heavy process because we were literally spawning a browser just to get information about a link. In the initial design that made sense, because we gave the user an option to consume the links. Over time our design improved, and we reached a point where we offer curation rather than consumption. Hence we opted for web scraping; I used custom headers so that we don't get caught by robots.txt. After so much effort, it got to a place where it is stable and performing great.
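
The app itself is written in Java, but the scraping idea is language-agnostic. As a rough sketch (mine, not the app's actual code), fetching a page title and favicon with a custom User-Agent header could look like this in Python, using only the standard library; the header value and helper names are illustrative.

from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen

class TitleIconParser(HTMLParser):
    """Collect the <title> text and any <link rel="...icon..."> href."""
    def __init__(self):
        super().__init__()
        self.title, self.icon, self._in_title = "", None, False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "link" and "icon" in (attrs.get("rel") or ""):
            self.icon = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def fetch_title_and_favicon(url):
    # A browser-like User-Agent so trivial bot checks don't reject the request.
    req = Request(url, headers={"User-Agent": "Mozilla/5.0 (LinkTray sketch)"})
    html = urlopen(req, timeout=10).read().decode("utf-8", errors="replace")
    parser = TitleIconParser()
    parser.feed(html)
    icon = urljoin(url, parser.icon or "/favicon.ico")
    return parser.title.strip(), icon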

It did take quite some time to reach the point where it is right now: fully functional and stable. Do give it a go if you haven't, and you can shoot any queries to me.

Link to Link Tray: https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed

Happy Hacking!

by fardroid23 at July 20, 2020 02:30 AM

July 18, 2020

Bhavin Gandhi

GNU Emacs pretest builds for Fedora

I have been following GNU Emacs development through Debbugs and Sacha Chua's newsletter. I always felt that I should use the latest development version of Emacs, instead of sticking to the stable release. That way I get to use the latest improvements and also help in testing the changes; if I find any bugs, I can report them. The motivation for building pretests: I was planning to build RPM packages for Fedora from the master branch.

by Bhavin Gandhi (bhavin192@removethis.geeksocket.in) at July 18, 2020 07:22 AM

July 17, 2020

Robin Schubert

Holiday Greetings

I'm on vacation at the North Sea with my family, and like exactly one year ago I was facing the problem of having too many postcards to write. Last year, I had written a small Python script that would take a yaml file and compile it to an HTML postcard.

The yaml describes all adjustable parts of the postcard, like the content and address, but also a title, stamp and front image. A jinja2 template, a bit of CSS and javascript create a flippable postcard that can be sent via email - which is very convenient if you, like me, are too lazy to buy postcards and stamps, and have more email addresses in your address book than physical addresses.

A postcard yaml could look like this (click the card to flip it around):

---

- name: Holiday Status 2020
  front_image: 'private_images/ninja.jpg'
  address: |
    Random Reader
    Schubisu's Blog
    World Wide Web
  title: I'm fine, thanks :)
  content: |
    Hey there!
    I'm currently on vacation and was stumbling over the same problem I had last year; writing greeting cards for friends and family. Luckily I've solved that issue last year, I simply had totally forgotten about it.
    This is an electronic postcard, made of HTML, CSS and a tiny bit of javascript, compiling my private photos and messages to a nice looking card.
    Feel free to fork, use and add whatever you like!
    Greets,
    Schubisu
  stamp: 'private_images/leuchtturm_2020.jpg'

and will be rendered by the script to this:

I was curious anyway how this would be rendered on my blog. I've added a small adjustment to my CSS to scale the iframe tag by 0.75% and I'm okay with the result ;)

Write your own postcard or add some features! You can find the repository here: https://gitlab.com/schubisu/postcard.
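
The repository has the real thing, but the core of such a script fits in a dozen lines. This is only a minimal sketch of the idea, with made-up file and template names, not Schubisu's actual code.

import yaml
from jinja2 import Environment, FileSystemLoader

env = Environment(loader=FileSystemLoader("."))
template = env.get_template("postcard.html.j2")      # the jinja2 template

with open("postcards.yaml") as fh:
    cards = yaml.safe_load(fh)                        # list of postcard dicts

for card in cards:
    html = template.render(**card)                    # name, address, content, ...
    outfile = card["name"].replace(" ", "_") + ".html"
    with open(outfile, "w") as out:
        out.write(html)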

by Robin Schubert at July 17, 2020 12:00 AM

July 16, 2020

Farhaan Bukhsh

Debugging React Native

Recently I have been trying to dabble in mobile application development. I wanted to do something cross-platform, so mostly I am being lazy: I wanted to write the application once and make it work on both iOS and Android. I had to choose between Flutter, which is a comparatively new framework, and React Native, which has been around for a while. I ended up choosing React Native, and thanks to Geeky Ants for Native Base, the development became really easy.

The way I am approaching the development of my application is by a divide-and-conquer strategy: I have divided the UI of the application into different components and I am developing one component at a time. I am using StoryBook to achieve this. It is a beautiful utility which you should check out; with it we can visualise each small component and see how it will render.

While developing this application I constantly found myself asking, "Did this function get called?" or "What is the value of this variable here?". Going back to my older self, I would have put a print statement on it, or, in JavaScript, a console.log. (Read it like Beyoncé's "put a ring on it".)

Having done that for some time, I asked myself: is there a better way to do this? And the answer is "YES!", there is always a better way, we just need to find it. Now enters the hero of our story, the debugger: with it I can perform various operations, like setting breakpoints (conditional and non-conditional) and analysing our code.

Let me quickly walk you through this. VS Code has a React Native plugin that has to be configured for use in our project. Once that is done we are almost ready to use it. I faced a few issues while getting it to work initially, so I thought having some pointers upfront might ease the development for other people.

Before I go into deeper detail, here is a preview of what the debugger looks like for React Native in VS Code.

Debugger In Action

So let’s get to the meat of it and how to get it to work. First and foremost we need to download and install React Native Tools from VS Code extensions.

Once that is done we are good to go. On the sidebar you can see that there is a Debug button; when you click on it, VS Code opens up a configuration file.

launch.json

There are various options you can play around with, but the one that worked best for me was Attach to packager. Once the configuration is in, we need to start the packager; mostly this is done with npm start, but VS Code also provides an option in the action bar at the bottom.

Action Bar

Once the packager is started we need to start our debugger: click on the Debug button on the sidebar and then on the Attach to packager icon. This will start your debugger from VS Code. At this point your action bar should be blue, showing that the debugger has started but is not active yet.

Then, on the device where you are deploying the application, you need to enable Debug JS, and voilà! Your debugger will be active. The debugger has helped me a lot: I could inspect each variable and see what values it holds at any point in time.

I can also step through the code and trace the flow of control, or put a conditional breakpoint and see whether a condition is met.

Debuggers have helped me a lot in making my development go faster; I hope this blog helps the readers too.

Happy Hacking!

by fardroid23 at July 16, 2020 04:11 AM

July 14, 2020

Kushal Das

Introducing pyage-rust, a Python module for age encryption

age is a simple, modern and secure file encryption tool; it was designed by @Benjojo12 and @FiloSottile.

An alternative interoperable Rust implementation is available at github.com/str4d/rage

pyage-rust is a Python module for age, built on top of the Rust crate. I am not a cryptographer, and I prefer to leave this important work to the specialists :)

pyage-rust demo

Installation

I have prebuilt wheels for both Linux and Mac, Python 3.7 and Python 3.8.

python3 -m pip install pyage-rust==0.1.0

Please have a look at the API documentation to learn how to use the API.

Building pyage-rust

You will need the nightly rust toolchain (via https://rustup.rs).

python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -r requirements-dev.txt
maturin develop && maturin build

Missing items in the current implementation

I am yet to add ways to use ssh keys or alias features, but those will come slowly in the future releases.

July 14, 2020 10:09 AM

July 06, 2020

Jason Braganza (Personal)

Tiny Habits

In a world where Atomic Habits did not exist, I'd call Tiny Habits the best book on behavioural change and habit building.
Or maybe, I am biased because I read James Clear’s book first.
Just as Cal Newport took Anders Ericsson's work and ran with it, so did James build on BJ Fogg's.

Tiny Habits is lovely; it has pretty tables and is an engaging read.
If you want to change your behaviour, you simply cannot go wrong by learning from the man who has taught many of our modern-day influencers, like Ramit Sethi or Instagram founder Mike Krieger.


by Mario Jason Braganza at July 06, 2020 01:52 PM

July 05, 2020

Nabarun Pal

Running Tor Proxy with Docker

Today I was testing dns-tor-proxy, which required a SOCKS5 Tor proxy, and realized I had never run a Tor service on my current machine. I use Tor Browser almost daily for browsing websites I have absolutely no trust in, but not the standalone Tor proxy. In this article, I will try to set one up using the system package as well as inside a Docker container. What is a Tor proxy? A Tor proxy is a SOCKS5 proxy which routes your traffic through the Tor network.
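
(Not from the article, but to make "SOCKS5 proxy" concrete: once a Tor proxy is listening on the default port 9050, any SOCKS-aware client can route its traffic through it. A small Python sketch, assuming requests is installed with its SOCKS extra, i.e. pip install "requests[socks]":)

import requests

# socks5h:// makes the proxy (Tor) do the DNS resolution as well,
# so lookups do not leak outside the Tor network.
proxies = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

resp = requests.get("https://check.torproject.org/api/ip",
                    proxies=proxies, timeout=30)
print(resp.json())   # should report IsTor: true when routed through Tor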

July 05, 2020 10:15 AM

June 30, 2020

Nabarun Pal

It's always DNS!

Context: I was running Airflow inside a Kubernetes cluster, but the Airflow pods were not able to connect to the PostgreSQL database running inside the cluster. The following was consistently seen in the Airflow logs, although the postgres-airflow service was up and running: sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not translate host name "postgres-airflow" to address: Temporary failure in name resolution. For the rest of this post, we will assume that all the user-run components inside the cluster are running perfectly and focus on what is causing the name resolution errors.

June 30, 2020 12:00 AM

June 22, 2020

Anwesha Das

PyLadies India embarked on its journey


20th June 2020 marked a new beginning for PyLadies in India. We had our first meetup.

[Photo: PyLadies India's first meetup]

I started my journey with PyLadies in 2016 as an organizer of PyLadies Pune. It began with the personal itch of a lawyer who wanted to learn Python. I revived PyLadies Pune; we had meetups within our local limits, and everything was hunky-dory.

But PyCon India 2016 gave us a platform to peep into a larger picture. The Pythonistas in India may be divided by language, culture and geographical location; nevertheless, our stories are similar. Men predominate in the Indian Python community (then, and unfortunately still now). And the community members who identify as women found their place in that small PyLadies Pune booth. We shared our stories and our journeys, and found out how similar they were. On the first day I was going around explaining who we "PyLadies" are, and getting some not-so-good reactions; by the end of the conference there was acceptance, respect, and recognition from the same people. We ended PyCon India 2016 in success with the initiation of one new chapter, PyLadies Delhi. And we realized: united we stand. Since then, every PyCon India has seen at least one new chapter come up. With a little bit of push, support, and help, I had some amazing ladies coming up, leading, and sharing the PyLadies baton in India. Now we have 8+ PyLadies chapters in India and many more to come. They make me feel happy and proud. But more than that, I have them to lean on, and with them the future of PyLadies India is safe and secure.

The chapters were running fine in their own places, meeting every year at PyCon India, and we were somewhat happy. But we felt the need for a united organization with all the chapters being a part of it: the face of PyLadies in India, a place where we would decide the future course of PyLadies in India. The talks had been on for quite some time, but the pandemic has now allowed us to embark on this journey. In these times it is a necessity to move all our meetups to an online platform. It has also allowed every chapter to increase its reach and not be bound by local limits.

But it came with a bunch of challenges for the chapter organizers -
a. deciding on the online platform,
b. finding the bandwidth among organizers to help set up the platform and run the online event,
c. the much wider choice of sessions it has opened up for PyLadies across India and the globe to attend and be a part of.

So in these circumstances, we, the different PyLadies chapters in India, have decided to come up with a single meetup: each month, all the chapters in India will put up a session together. But of course, if any chapter wants to have its own meetup, it is free to do so; there is no rationing of meetups. The only thing we have to try to make sure is that we do not collide on dates. So here we are with the first PyLadies India session. It will not be an exaggeration to state that we owe this to the pandemic :)

The team got into planning and working on it: social media, creatives, preparing the platform for the broadcast, AV, and other settings. I, along with my amazing PyLadies Nancy, Sakshi, Vaishnavi, and Niharika, was busy and on my toes. The choice of speaker for the first PyLadies India session was natural; we were unanimous that the first woman CPython core developer should be our speaker. All of us wanted Mariatta, and I cannot thank her enough for being ever helpful and saying yes to taking the session; all it took was me messaging her. We wanted the session to be introductory, so that people who are starting their journey with PyLadies India would be able to start and grow with us. So she suggested a tutorial on GitHub bots, and we readily agreed to the idea.

The plan for the 20th was all set: I would introduce us, PyLadies India; then Mariatta would let us know what is happening with PyLadies globally and take the session on GitHub bots, and I would moderate the whole session. But my health (umm, age) pulled my excitement down. I had to undergo gum and tooth surgery, which required me not to talk. Then Niharika came to the rescue; she gave voice to my thoughts.

The session ended on a happy note. People received Mariatta's session very well for being detailed and smooth.

We cannot wait to meet again. Follow us on Twitter and Instagram, and subscribe to our YouTube channel. Some of us may now want to thank the pandemic for this opportunity :) Thank you, PyLadies, au revoir.

by Anwesha Das at June 22, 2020 09:36 AM

Robin Schubert

Python + pandas + matplotlib vs. R + tidyverse - a quick comparison

TL;DR

When I started this blog post, my intention was to compare basic data wrangling and plotting in Python and R. I write most of my code in Python, doing mostly scientific stuff, data wrangling, basic statistics and plots. However, although the pandas framework has been built to resemble R's data.frame class, the functional programming style that R provides with the tidyverse - or rather with the %>% operator of the magrittr package - often gives a way more natural feeling when handling data.
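
(A side note that is mine, not the author's: pandas does offer a rough counterpart to %>% in DataFrame.pipe, which lets you chain your own functions in reading order, even if it never feels quite as natural as the R pipe.)

import pandas as pd

def add_total(df):
    return df.assign(total=df["a"] + df["b"])

def only_positive(df):
    return df[df["total"] > 0]

df = pd.DataFrame({"a": [1, -5, 3], "b": [2, 1, -1]})

# Reads top to bottom, like df %>% add_total() %>% only_positive() in R.
result = df.pipe(add_total).pipe(only_positive)
print(result)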

To be honest, I was sure that the tidyverse packages would outperform pandas; however, looking at the code snippets now, I must admit that the difference is less obvious than I previously thought. Still, in more complex situations, Python code looks more verbose in comparison, while R seems more concise and consistent.

Plotting may be a different story, but to make a fair comparison there I would have to take into account the variety of plotting libraries that the Python ecosystem offers, which would be far beyond the scope of this post.

The ggplot2 package, on the other hand, has been developed as an implementation of the "Grammar of Graphics", a philosophy by Leland Wilkinson that divides visualizations into semantic components. This implementation provides huge flexibility, as we'll see later.

Setting it up

Traditionally, the popular iris data set is used to demonstrate data wrangling, but in the given situation let's play around with the Johns Hopkins University COVID-19 data from github. As an example, the time series of global confirmed cases can be accessed here.

Let's set up our working environments and load the three data sets confirmed, recovered and deaths in R and Python.

R:

library(tidyverse)

base_url <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/"
confirmed <- read.csv(paste0(base_url, "time_series_covid19_confirmed_global.csv"))
recovered <- read.csv(paste0(base_url, "time_series_covid19_recovered_global.csv"))
deaths <- read.csv(paste0(base_url, "time_series_covid19_deaths_global.csv"))

head(confirmed)

##   Province.State      Country.Region      Lat     Long X1.22.20 X1.23.20   ...
## 1                        Afghanistan  33.0000  65.0000        0        0
## 2                            Albania  41.1533  20.1683        0        0
## 3                            Algeria  28.0339   1.6596        0        0
## 4                            Andorra  42.5063   1.5218        0        0
## 5                             Angola -11.2027  17.8739        0        0
## 6                Antigua and Barbuda  17.0608 -61.7964        0        0
## ...

Python:

import pandas as pd

base_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/"
confirmed = pd.read_csv(base_url + "time_series_covid19_confirmed_global.csv")
recovered = pd.read_csv(base_url + "time_series_covid19_recovered_global.csv")
deaths = pd.read_csv(base_url + "time_series_covid19_deaths_global.csv")

confirmed.head()

#   Province/State Country/Region      Lat     Long  1/22/20  1/23/20  1/24/20  ...
# 0            NaN    Afghanistan  33.0000  65.0000        0        0        0  ...
# 1            NaN        Albania  41.1533  20.1683        0        0        0  ...
# 2            NaN        Algeria  28.0339   1.6596        0        0        0  ...
# 3            NaN        Andorra  42.5063   1.5218        0        0        0  ...
# 4            NaN         Angola -11.2027  17.8739        0        0        0  ...

So far, so similar. When looking at the loaded data sets, we see that the format is kind of ugly. These are time series with names of states and countries in the first two columns, latitude and longitude coordinates in the third and fourth, and one column per day for the rest of the set, containing the numbers of confirmed / recovered / death cases reported for the country/region.

To make working with the data easier, I want to transform them to a long format, having a single column date with all the dates and a column confirmed (or recovered and deaths respectively) with the corresponding number of cases.

In R we could do the following:

confirmed_long <- confirmed %>%
  pivot_longer(
    c(-`Province.State`, -`Country.Region`, -Lat, -Long),
    names_to="date",
    values_to="confirmed") %>%
  mutate(date = as.Date(date, format = "X%m.%d.%y"))

head(confirmed_long)

##   Province.State Country.Region   Lat  Long date       confirmed
##   <fct>          <fct>          <dbl> <dbl> <date>         <int>
## 1 ""             Afghanistan       33    65 2020-01-22         0
## 2 ""             Afghanistan       33    65 2020-01-23         0
## 3 ""             Afghanistan       33    65 2020-01-24         0
## 4 ""             Afghanistan       33    65 2020-01-25         0
## 5 ""             Afghanistan       33    65 2020-01-26         0
## 6 ""             Afghanistan       33    65 2020-01-27         0

very similarly in Python:

confirmed_long = pd.melt(
    confirmed,
    id_vars = ["Province/State", "Country/Region", "Lat", "Long"],
    var_name = "date",
    value_name = "confirmed")\
    .assign(date = lambda x: pd.to_datetime(x.date))

confirmed_long.head()

#   Province/State Country/Region      Lat     Long       date  confirmed
# 0            NaN    Afghanistan  33.0000  65.0000 2020-01-22          0
# 1            NaN        Albania  41.1533  20.1683 2020-01-22          0
# 2            NaN        Algeria  28.0339   1.6596 2020-01-22          0
# 3            NaN        Andorra  42.5063   1.5218 2020-01-22          0
# 4            NaN         Angola -11.2027  17.8739 2020-01-22          0

As you can see, I took the opportunity to also parse the strings in the date column to datetime. While in R we can pipe the output of the previous command into the next with the %>% operator, we achieve a very similar effect using pandas' assign function.

We can do the same with the remaining two data sets. However, we have plenty of redundant information, since all three data sets will be the same apart from the confirmed, recovered and deaths columns. So let's merge them into one data frame. Also, I don't care about single provinces and states but rather want to sum up the data country-wise. This will mess up our latitudes and longitudes, so I just drop them here.

R:

covid_long <- confirmed_long %>%
  left_join(recovered_long) %>%
  left_join(deaths_long) %>%
  select(-Lat, -Long, -Province.State) %>%
  group_by(Country.Region, date) %>%
  summarise_all(sum)

Python:

covid_long = pd.merge(
    pd.merge(
        confirmed_long,
        recovered_long,
        how="left"),
    deaths_long,
    how="left"
)

We're ready to look at some plots. I would like to see a line plot that shows us the numbers of confirmed, recovered and death cases against the time, on a log transformed scale. Different countries should be coded in colors, the confirmed, recovered and deaths lines should be coded in line styles.

Let's have a look at the Python code first:

# first I merge the three data sets into one, drop some columns
# I'm not interested in and sum the numbers by country, as I don't
# want to look at specific states here.

covid_long = pd.merge(
    pd.merge(
        confirmed_long,
        recovered_long,
        how="left"),
    deaths_long,
    how="left")\
    .drop(["Lat", "Long"], axis = 1)\
    .groupby(["Country/Region", "date"])\
    .sum()\
    .reset_index()

# I define a list of three countries I want to see in the plot

countries = ["US", "Germany", "India"]

# and set the y-scale of my axes object to log scale, since we expect
# exponential growth

import matplotlib.pyplot as plt  # not imported earlier; needed for the plots

fig, ax = plt.subplots()
ax.set_yscale('log')

for country in countries:
    covid_long.query('`Country/Region` == @country')\
              .plot(x="date", y="confirmed", ax=ax, label=country)

So far, these are just the numbers of confirmed cases. For plots in matplotlib, I know no better way than to plot the different countries sequentially or in a loop. To add the other cases, I add another loop.

# to plot confirmed, recovered and deaths cases in one plot
# we have to put a little effort into it

# we need a set of colors for our countries and a set of styles
# for the different status attributes

colors = plt.get_cmap('Set1', len(countries))

styles = ['-', '--', '.']

fig, ax = plt.subplots()
ax.set_yscale('log')

for i, status in enumerate(['confirmed', 'recovered', 'deaths']):
    for j, country in enumerate(countries):
        covid_long.query('`Country/Region` == @country')\
                        .plot(
                            x="date",
                            y=status,
                            label=f"{country} - {status}",
                            c=colors(j),
                            ax=ax,
                            style=styles[i])

While this is a bit messy in a single plot, we can use multiple subplots instead of line style, to show different status.

# we can split this up into multiple subplots for better overview

fig, ax = plt.subplots(3, 1)

for i, status in enumerate(['confirmed', 'recovered', 'deaths']):
    ax[i].set_yscale('log')
    ax[i].set_title(status)
    for j, country in enumerate(countries):
        covid_long.query('`Country/Region` == @country')\
                        .plot(
                            x="date",
                            y=status,
                            label=f"{country}",
                            c=colors(j),
                            ax=ax[i])

Finally, I want to compute the deviation of confirmed cases, to get the daily growth. Here's how I would do it with pandas:

# we pivot the table and compute the deviation. I don't know
# a more elegant way than to do it in multiple steps

covid_even_longer = pd.melt(
    covid_long,
    id_vars = ["Country/Region", "date"],
    var_name = "status",
    value_name = "count")

covid_even_longer.loc[:, 'lagged'] = (covid_even_longer\
                                      .sort_values(by=['date'], ascending=True)\
                                      .groupby(['Country/Region', 'status'])['count']\
                                      .shift(-1))

covid_even_longer = covid_even_longer\
    .assign(deviation = lambda x: x['lagged'] - x['count'])


fig, ax = plt.subplots()

for i, country in enumerate(countries):
    covid_even_longer.query('`Country/Region` == @country and status == "confirmed"')\
                    .plot(
                        x="date",
                        y="deviation",
                        label=country,
                        c=colors(i),
                        ax=ax)

So now to have a direct comparison, here's the tidyverse way to achieve the above results:

## we define a vector of countries we want to look at
countries <- c("US", "Germany", "India")

## plot the confirmed, recovered and deaths cases in one plot
## we again pivot our table and make use of ggplots aesthetics

covid_long %>%
  filter(Country.Region %in% countries) %>%
  pivot_longer(cols = c("confirmed", "recovered", "deaths"),
               names_to = "status",
               values_to = "count") %>%
  ggplot(aes(x = date, y = count, color = Country.Region)) +
    geom_line(aes(linetype = status), size = 1.2) +
    scale_y_log10()

## this looks a bit chaotic in a single plot, we can also use
## facet_grid to break this up into sub-plots

covid_long %>%
  filter(Country.Region %in% countries) %>%
  pivot_longer(cols = c("confirmed", "recovered", "deaths"),
               names_to = "status",
               values_to = "count") %>%
  ggplot(aes(x = date, y = count, color = Country.Region)) +
    facet_grid("status") +
    geom_line(size = 1.2) +
    scale_y_log10()

## it's easy to on-the-fly calculate a deviation to get daily
## change rates by groups and pipe the resulting dataframe into
## the plotting function again

covid_long %>%
  filter(Country.Region %in% countries) %>%
  pivot_longer(cols = c("confirmed", "recovered", "deaths"),
               names_to = "status",
               values_to = "count") %>%
  group_by(Country.Region, status) %>%
  mutate(deviation = lead(count) - count) %>%
  ungroup() %>%
  filter(status == "confirmed") %>%
  ggplot(aes(x = date, y = deviation, color = Country.Region)) +
    geom_line(aes(linetype = status), size = 1.2)

Summary

While there is hardly any difference visible for simpler data transformations, merging and pivoting, the Python syntax becomes quite verbose when plotting slightly more complex data. The R workflow, in contrast, is straightforward, readable and arguably more beautiful.

Describing visualizations in semantic components lets you focus on what to show, not how. No loops necessary. It's not necessary to create as many temporary variables, since it simply doesn't take many extra steps to achieve what I want; computing and adding new columns in my data frame is done on the fly.

Of course, there is much more you can do with both libraries. I haven't touched theming at all, nor different plot types. However, this is a fairly common set-up that I have to deal with in my daily work, so although not a complete comparison, it should be a relevant one.

by Robin Schubert at June 22, 2020 12:00 AM

June 20, 2020

Jason Braganza (Personal)

We Need To Talk About the British Empire


It’s an Audible Original.
And it’s “free”, if you are an Audible member.

A look into what Empire means today.
It’s a series of engaging podcast episodes on what being a part of the British Empire meant/means for its subjects and its descendants, with stories from across the globe.

My only quibble being they did not go deep enough.
The people being interviewed are mostly descendants of subjects, of British origin (and not, as I would expect, an actual survivor of the Partition, or someone here in India, or on the Somali coast, or in Sierra Leone).

No one knows much of the aftermath of Partition, and there's a lovely little story, tightly told, that illustrates the horrors of the period.
Somalia was used, abused, robbed of all it had and then left to fend for itself.
Bunce Island, in Sierra Leone, was home to a slavers' bay: the island hosted a prison for the captured natives, who were branded and sold as slaves, and a golf course for the gora sahibs.

What struck me (from the interviews and some of the reviews) is that the British have no clue about the generational ramifications of their actions.
They think that enough time has passed, it's all water under the bridge, and we ought to have pulled ourselves up by our bootstraps by now.
I realise why Tharoor demanded reparations.

At about three hours, it’s well worth a listen.


by Mario Jason Braganza at June 20, 2020 10:43 AM

June 17, 2020

Anwesha Das

Software Licenses : Legalese to English

While doing a licensing survey in the Fedora ecosystem, I asked a few developers, "What is a license, according to you?" I got some interesting answers:

"I do not care about the license; it bores me." - a super senior developer says this. (not a very good example to follow)

"You have to fill up the name of a license to make the package in Fedora unless they won't accept the package" (sadly)

"License is something that protects your code." (Ahh finally some optimism)

The last answer appeared to me as a ray of hope: yes, there are (still) developers who do care, both about their code and about the law.

But most developers (if not all) see licenses as long, huge legal documents filled with complicated words and never-ending sentences. There are certain tags that people tend to associate with licenses: MIT is permissive, GPL is strict, BSD is secure. Instead of going through the license document itself, developers choose the license for their project based on these mental tags.

From today I am going to start a series explaining different kinds of software licenses. The series will translate legalese into English. Together we will try to understand the licenses and their permissions, one by one.

Before that, let us understand what an open source license is and why it is essential.

Open source software or hardware is software or hardware that is released under an open source license. Open source licenses grant certain intellectual property rights, in this case copyright, to their users.

The phrase open source software or hardware points to software or hardware which has its source code or design open and available to anybody, so that they can

  • examine,
  • modify,
  • enhance, and
  • share it.

Open source projects, products and communities are based on an ethos of collaboration, open and free exchange, and community sharing, which results in a transparent process and a better product. Open source licenses stand in opposition to closed source / proprietary licenses. For the last few years, there has been a surge in the usage of open source licenses because

  • products, processes, or initiatives under open source licenses are believed to be secure and stable,

  • the community has control over them,

  • the collaborative effort produces a better product, and

  • they train programmers and designers to be better at their jobs through mentorship and community effort.

Licenses lie at the very core of the open source and free software community. Though we often say "Open Source" and "Free and Open Source" (FOSS) in the same breath, they are quite different from each other in their permissions and restrictions.

By choosing a license, developers mark the limits of their product and select the community they want to work with. So it is of utmost importance for them to know licenses well. From the next post of this series, we will strive towards that goal.

by Anwesha Das at June 17, 2020 06:07 AM

June 07, 2020

Kuntal Majumder

Transitioning to Windows

So, recently I started using Windows for work. Why? There are a couple of reasons: one, that I needed to use MSVC, that is, the Microsoft Visual C++ toolchain; the other being that I wasn't quite comfortable ifdef-ing stuff to make it work on GCC, aka the GNU counterpart of MSVC.

June 07, 2020 02:38 PM

May 23, 2020

Bhavin Gandhi

Monitoring workstation with Prometheus

Prometheus is a monitoring system and a time series database. It can collect metrics from different places and store them as series of values over time. It uses a pull-based mechanism to collect the metrics: applications expose metrics in a plain text format over an HTTP server, and Prometheus then fetches them. Fetching of metrics is called scraping. For systems which don't expose metrics in the Prometheus exposition format, we can use exporters.
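
(Not part of the original post, but to make the "expose metrics over HTTP, Prometheus scrapes them" idea concrete, here is a minimal sketch using the official prometheus_client Python library; the metric name and values are invented.)

import random
import time

from prometheus_client import Gauge, start_http_server

# A toy metric; Prometheus would scrape http://localhost:8000/metrics
cpu_temp = Gauge("workstation_cpu_temperature_celsius", "Fake CPU temperature")

if __name__ == "__main__":
    start_http_server(8000)                # plain-text exposition endpoint
    while True:
        cpu_temp.set(40 + random.random() * 20)
        time.sleep(5)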

by Bhavin Gandhi (bhavin192@removethis.geeksocket.in) at May 23, 2020 02:32 PM

May 09, 2020

Kuntal Majumder

Krita Weekly #14

After an anxious month, I am writing a Krita Weekly again, and this would probably be my last one too, though I hope not. Let's start by talking about bugs. Unlike the trend of the last couple of months, the numbers have taken a serious dip.

May 09, 2020 04:12 PM

May 02, 2020

Abhilash Raj

Why did my machine reboot?

Yesterday, I was trying to debug why my server running Plex rebooted, and wasn't able to figure it out. Along the way I discovered several new commands, the best of which is called last.

This article was really useful.

who

The who command prints which users are currently logged in:

$ who   
maxking  tty2         2020-04-30 22:17 (tty2)

But, you can use it with -b to figure out the last reboot time:

$ who -b       
         system boot  2020-04-30 22:16

last

The last command prints a bunch of useful information, including:

  • the time of the last reboot: last reboot
  • the time of the last shutdown: last shutdown
  • all the system restart/shutdown events and run level changes: last -x

It also has a bunch of useful flags, like --present, which also prints which users were present at the time (I really liked this one!):

$ last -p 14:40
maxking  tty2         tty2             Thu Apr 30 22:17   still logged in
reboot   system boot  5.6.7-300.fc32.x Thu Apr 30 22:16   still running

useful logs

I also found this StackExchange answer with a lot of useful log files that have information related to reboots, like kernel messages, the systemd journal, etc.

by Abhilash Raj at May 02, 2020 11:11 PM

April 11, 2020

Shakthi Kannan

Using Docker with Ansible

[Published in Open Source For You (OSFY) magazine, October 2017 edition.]

This article is the eighth in the DevOps series. In this issue, we shall learn to set up Docker in the host system and use it with Ansible.

Introduction

Docker provides operating system level virtualisation in the form of containers. These containers allow you to run standalone applications in an isolated environment. The three important features of Docker containers are isolation, portability and repeatability. All along we have used Parabola GNU/Linux-libre as the host system, and executed Ansible scripts on target Virtual Machines (VM) such as CentOS and Ubuntu.

Docker containers are extremely lightweight and fast to launch. You can also specify the amount of resources that you need, such as CPU, memory and network. The Docker technology was launched in 2013, and released under the Apache 2.0 license. It is implemented using the Go programming language. A number of frameworks have been built on top of Docker for managing these clusters of servers. The Apache Mesos project, Google's Kubernetes, and the Docker Swarm project are popular examples. These are ideal for running stateless applications and help you to easily scale them horizontally.

Setup

The Ansible version used on the host system (Parabola GNU/Linux-libre x86_64) is 2.3.0.0. Internet access should be available on the host system. The ansible/ folder contains the following file:

ansible/playbooks/configuration/docker.yml

Installation

The following playbook is used to install Docker on the host system:

---
- name: Setup Docker
  hosts: localhost
  gather_facts: true
  become: true
  tags: [setup]

  tasks:
    - name: Update the software package repository
      pacman:
        update_cache: yes

    - name: Install dependencies
      package:
        name: "{{ item }}"
        state: latest
      with_items:
        - python2-docker
        - docker

    - service:
        name: docker
        state: started

    - name: Run the hello-world container
      docker_container:
        name: hello-world
        image: library/hello-world

The Parabola package repository is updated before proceeding to install the dependencies. The python2-docker package is required for use with Ansible. Hence, it is installed along with the docker package. The Docker daemon service is then started and the library/hello-world container is fetched and executed. A sample invocation and execution of the above playbook is shown below:

$ ansible-playbook playbooks/configuration/docker.yml -K --tags=setup
SUDO password: 

PLAY [Setup Docker] *************************************************************

TASK [Gathering Facts] **********************************************************
ok: [localhost]

TASK [Update the software package repository] ***********************************
changed: [localhost]

TASK [Install dependencies] *****************************************************
ok: [localhost] => (item=python2-docker)
ok: [localhost] => (item=docker)

TASK [service] ******************************************************************
ok: [localhost]

TASK [Run the hello-world container] ********************************************
changed: [localhost]

PLAY RECAP **********************************************************************
localhost                  : ok=5    changed=2    unreachable=0    failed=0   

With the verbose '-v' option to ansible-playbook, you will see an entry for LogPath, such as /var/lib/docker/containers//-json.log. In this log file you will see the output of the execution of the hello-world container. This output is the same as when you run the container manually, as shown below:

$ sudo docker run hello-world

Hello from Docker!

This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://cloud.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/engine/userguide/

Example

A Deep Learning (DL) Docker project is available (https://github.com/floydhub/dl-docker) with support for frameworks, libraries and software tools. We can use Ansible to build the entire DL container from the source code of the tools. The base OS of the container is Ubuntu 14.04, and it will include the following software packages:

  • Tensorflow
  • Caffe
  • Theano
  • Keras
  • Lasagne
  • Torch
  • iPython/Jupyter Notebook
  • Numpy
  • SciPy
  • Pandas
  • Scikit Learn
  • Matplotlib
  • OpenCV

The playbook to build the DL Docker image is given below:

- name: Build the dl-docker image
  hosts: localhost
  gather_facts: true
  become: true
  tags: [deep-learning]

  vars:
    DL_BUILD_DIR: "/tmp/dl-docker"
    DL_DOCKER_NAME: "floydhub/dl-docker"

  tasks:
    - name: Download dl-docker
      git:
        repo: https://github.com/saiprashanths/dl-docker.git 
        dest: "{{ DL_BUILD_DIR }}"

    - name: Build image and with buildargs
      docker_image:
         path: "{{ DL_BUILD_DIR }}"
         name: "{{ DL_DOCKER_NAME }}"
         dockerfile: Dockerfile.cpu
         buildargs:
           tag: "{{ DL_DOCKER_NAME }}:cpu"

We first clone the Deep Learning docker project sources. The docker_image module in Ansible helps us to build, load and pull images. We then use the Dockerfile.cpu file to build a Docker image targeting the CPU. If you have a GPU in your system, you can use the Dockerfile.gpu file. The above playbook can be invoked using the following command:

$ ansible-playbook playbooks/configuration/docker.yml -K --tags=deep-learning

Depending on the CPU and RAM you have, it will take a considerable amount of time to build the image with all the software. So be patient!

Jupyter Notebook

The built dl-docker image contains Jupyter notebook which can be launched when you start the container. An Ansible playbook for the same is provided below:

- name: Start Jupyter notebook
  hosts: localhost
  gather_facts: true
  become: true
  tags: [notebook]

  vars:
    DL_DOCKER_NAME: "floydhub/dl-docker"

  tasks:
    - name: Run container for Jupyter notebook
      docker_container:
        name: "dl-docker-notebook"
        image: "{{ DL_DOCKER_NAME }}:cpu"
        state: started
        command: sh run_jupyter.sh

You can invoke the playbook using the following command:

$ ansible-playbook playbooks/configuration/docker.yml -K --tags=notebook

The Dockerfile already exposes the port 8888, and hence you do not need to specify the same in the above docker_container configuration. After you run the playbook, using the ‘docker ps’ command on the host system, you can obtain the container ID as indicated below:

$ sudo docker ps
CONTAINER ID        IMAGE                    COMMAND               CREATED             STATUS              PORTS                NAMES
a876ad5af751        floydhub/dl-docker:cpu   "sh run_jupyter.sh"   11 minutes ago      Up 4 minutes        6006/tcp, 8888/tcp   dl-docker-notebook

You can now login to the running container using the following command:

$ sudo docker exec -it a876 /bin/bash

You can then run an ‘ifconfig’ command to find the local IP address (“172.17.0.2” in this case), and then open http://172.17.0.2:8888 in a browser on your host system to see the Jupyter Notebook. A screenshot is shown in Figure 1:

Jupyter Notebook

TensorBoard

TensorBoard consists of a suite of visualization tools to understand the TensorFlow programs. It is installed and available inside the Docker container. After you login to the Docker container, at the root prompt, you can start Tensorboard by passing it a log directory as shown below:

# tensorboard --logdir=./log

You can then open http://172.17.0.2:6006/ in a browser on your host system to see the Tensorboard dashboard as shown in Figure 2:

TensorBoard

Docker Image Facts

The docker_image_facts Ansible module provides useful information about a Docker image. We can use it to obtain the image facts for our dl-docker container as shown below:

- name: Get Docker image facts
  hosts: localhost
  gather_facts: true
  become: true
  tags: [facts]

  vars:
    DL_DOCKER_NAME: "floydhub/dl-docker"

  tasks:
    - name: Get image facts
      docker_image_facts:
        name: "{{ DL_DOCKER_NAME }}:cpu"

The above playbook can be invoked as follows:

$ ANSIBLE_STDOUT_CALLBACK=json ansible-playbook playbooks/configuration/docker.yml -K --tags=facts 

The ANSIBLE_STDOUT_CALLBACK environment variable is set to ‘json’ to produce a JSON output for readability. Some important image facts from the invocation of the above playbook are shown below:

"Architecture": "amd64", 
"Author": "Sai Soundararaj <saip@outlook.com>", 

"Config": {

"Cmd": [
   "/bin/bash"
], 

"Env": [
   "PATH=/root/torch/install/bin:/root/caffe/build/tools:/root/caffe/python:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin", 
   "CAFFE_ROOT=/root/caffe", 
   "PYCAFFE_ROOT=/root/caffe/python", 
   "PYTHONPATH=/root/caffe/python:", 
   "LUA_PATH=/root/.luarocks/share/lua/5.1/?.lua;/root/.luarocks/share/lua/5.1/?/init.lua;/root/torch/install/share/lua/5.1/?.lua;/root/torch/install/share/lua/5.1/?/init.lua;./?.lua;/root/torch/install/share/luajit-2.1.0-beta1/?.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua", 
   "LUA_CPATH=/root/torch/install/lib/?.so;/root/.luarocks/lib/lua/5.1/?.so;/root/torch/install/lib/lua/5.1/?.so;./?.so;/usr/local/lib/lua/5.1/?.so;/usr/local/lib/lua/5.1/loadall.so", 
   "LD_LIBRARY_PATH=/root/torch/install/lib:", 
   "DYLD_LIBRARY_PATH=/root/torch/install/lib:"
], 

"ExposedPorts": {
   "6006/tcp": {}, 
   "8888/tcp": {}
}, 

"Created": "2016-06-13T18:13:17.247218209Z", 
"DockerVersion": "1.11.1", 

"Os": "linux", 

"task": { "name": "Get image facts" }

You are encouraged to read the ‘Getting Started with Docker’ user guide available at http://docs.ansible.com/ansible/latest/guide_docker.html to know more about using Docker with Ansible.

April 11, 2020 06:30 PM

March 27, 2020

Pradyun Gedam

Testing the next-gen pip dependency resolver

This is an attempt to summarize the broader software architecture around dependency resolution in pip and how testing is being done around this area.

The motivation behind writing this, is to make sure all the developers working on this project are on the same page, and to have a written record about the state of affairs.

Architecture

The "legacy" resolver in pip is implemented as part of pip's codebase and has been a part of it for many years. It's very tightly coupled with the existing code, isn't easy to work with and has severe backward compatibility concerns with modifying it directly – which is why we're implementing a separate "new" resolver in this project, instead of trying to improve the existing one.

The “new” resolver that is under development, is not implemented as part of pip’s codebase; not completely anyway. We’re using an abstraction that separates all the metadata-generation-and-handling stuff vs the core algorithm. This allows us to work on the core algorithm logic (i.e. the NP-hard search problem) separately from pip-specific logic (eg. download, building etc). The abstraction and core algorithm are written/maintained in https://github.com/sarugaku/resolvelib right now. The pip-specific logic for implementing the “other side” of the abstraction is in https://github.com/pypa/pip/tree/master/src/pip/_internal/resolution/resolvelib.
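
(A conceptual sketch, in plain Python, of what that separation looks like: the pip-specific side answers questions about packages, while the core resolver only walks the search space through these hooks. This illustrates the shape of the abstraction; it is not resolvelib's actual class or method names.)

class PackagingProvider:
    """Everything ecosystem-specific lives here (index queries, downloads, metadata)."""

    def find_candidates(self, requirement):
        # e.g. query an index and return concrete versions matching the specifier
        raise NotImplementedError

    def is_satisfied_by(self, requirement, candidate):
        # does this concrete candidate satisfy the abstract requirement?
        raise NotImplementedError

    def get_dependencies(self, candidate):
        # metadata generation: what does this candidate itself require?
        raise NotImplementedError

# The core algorithm (the NP-hard search) only ever calls these hooks,
# so it can be developed and tested without touching pip internals.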

Testing

In terms of testing, we have dependency-resolution-related tests in both resolvelib and pip.

resolvelib

The tests in resolvelib are intended more as “check if the algorithm does things correctly” and even contains tests that are agnostic to the Python ecosystem (eg. we’ve borrowed tests from Ruby, Swift etc). The goal here is to make sure that the core algorithm we implement is capable of generating correct answers (for example: not getting stuck in looping on the same “requirement”, not revisiting rejected nodes etc).

pip

The tests in pip is where I’ll start needing more words to explain what’s happening. :)

YAML-based tests

We have “YAML” tests which I’d written back in 2017, as a format to easily write tests for pip’s new resolver when we implement it. However, since we didn’t have a need for it to be working completely back then (there wasn’t a new resolver to test with it!), the “harness” for running these tests isn’t complete and would likely need some work to be as feature complete as we’d want it to be, for writing good tests.

YAML tests: https://github.com/pypa/pip/tree/master/tests/yaml
YAML test “harness”: https://github.com/pypa/pip/blob/master/tests/functional/test_yaml.py and https://github.com/pypa/pip/blob/master/tests/lib/yaml_helpers.py

“new” resolver tests

unit tests

We have some unit tests for the new resolver implementation. These cover very basic “sanity checks” to ensure it follows the “contract” of the abstraction, like “do the candidates returned by a requirement actually satisfy that requirement?”. These likely don’t need to be touched, since they’re fairly well scoped and test fairly low-level details (i.e. ideal for unit tests).

New resolver unit tests: https://github.com/pypa/pip/tree/master/tests/unit/resolution_resolvelib

functional tests

We also have “new resolver functional tests”, which are written as part of the current work. These exist since how-to-work-with-YAML-tests was not an easy question to answer, and there needs to be work done (both on the YAML format, as well as the YAML test harness) to flag which tests should run with which resolver (both, only legacy, only new) and make it possible to run these tests in CI easily.

New resolver functional tests: https://github.com/pypa/pip/blob/master/tests/functional/test_new_resolver.py

test_install*.py

These files test all the functionality of the install command (like: does it use the right build dependencies, does it download the correct files, does it write the correct metadata etc). There might be some dependency-resolution-related tests in test_install*.py files.

These files contain a lot of tests so, ideally, at some point, someone would go through and de-duplicate tests from this as well.

How can you help?

If you use pip, there are multiple ways that you can help us!


Thanks to Paul Moore and Tzu-Ping for help in reviewing and writing this post, as well as Sumana Harihareswara for suggesting to put this up on my blog!

March 27, 2020 12:00 AM

February 19, 2020

Pradyun Gedam

OSS Work update #8

I’m trying to post these roughly once a month. Here’s the January post.

I am working on open source projects, as part of an internship at FOSSEE and as a part of grant-funded work on pip’s dependency resolver.

Work I did (Jan 6 - Feb 5)

Technical

  • Co-worked with another developer, in person, for 1 week, on pip!
  • Triaged pip’s issue tracker (a lot).
  • Spent some time improving pip’s test suite infrastructure.
  • Investigated Python 2 usage, to identify anomalies.
  • Helped with virtualenv 20.0 release (kinda!).
  • Invested effort to improve pip’s test suite
  • Helped aggregate test cases for pip’s next generation resolver.

Communication

  • Managed the pip 20.0 release fiasco.
  • Helped the UX folks get started with working on pip.

Additional notes on challenges

January has been a very productive month.

Most of the challenges have been the logistics around work, not the work as such.

My health has been pretty good and there’s a certain flow to my work that I’m enjoying now. Turns out, if you like what you’re doing, you tend to be pretty productive! :)

As long as I remember to push my blog posts to the repository, they’ll actually go live on the day they’re supposed to.

Goals for February 2020

Technical

  • Internal Cleansing: AKA Technical debt down payment.
  • Issue triage: Triage a fair number of issues on pip’s issue tracker.
  • Technical Documentation: improving pip’s technical documentation, for contributors and developers

Communication

  • Help all the other contractors to get up to “full speed” for working on pip
  • Get PyPA to participate in GSoC 2020
  • Python Packaging Summit at PyCon US 2020: help organization.
  • Move forward on Python Packaging Governance

Other commitments

None.

Help us

How can you help us?

  • provide test cases where the latest released version of pip (19.3.1, at the time of writing) fails to resolve dependencies properly (on zazo’s issue tracker). They will help us design and test the new resolver.
  • talk with your company about becoming a PSF sponsor. The Fundable Packaging Improvements page lists fairly well-scoped projects that would happen much faster if we get funding to achieve them.
  • Have an interview with our UX expert, who is working to improve usability of Python Packaging tooling.

February 19, 2020 12:00 AM

February 10, 2020

Armageddon

Building up simple monitoring on Healthchecks

I talked previously about deploying my own simple monitoring system.

Now that it's up, I'm only using it for my backups. That's a good use, for sure, but I know I can do better.

So I went digging.

Read more… (2 min remaining to read)

by Elia el Lazkani at February 10, 2020 11:00 PM

February 08, 2020

Armageddon

Simple cron monitoring with HealthChecks

In a previous post, I showed you how you can automate your borg backups with borgmatic.

After I started using borgmatic for my backups and hooked it to a cron running every 2 hours, I got interested into knowing what's happening to my backups at all times.

My experience comes in handy here: I know I need a monitoring system. I also know that traditional monitoring systems are too complex for my use case.

I need something simple. I need something I can deploy myself.

Read more… (2 min remaining to read)

by Elia el Lazkani at February 08, 2020 11:00 PM

January 19, 2020

Rahul Jha

"isn't a title of this post" isn't a title of this post

[NOTE: This post originally appeared on deepsource.io, and has been posted here with due permission.]

In the early part of the last century, when David Hilbert was working on stricter formalization of geometry than Euclid, Georg Cantor had worked out a theory of different types of infinities, the theory of sets. This theory would soon unveil a series of confusing paradoxes, leading to a crisis in the Mathematics community  regarding the stability of the foundational principles of the math of that time.

Central to these paradoxes was the Russell’s paradox (or more generally, as we’d talk about later, the Epimenides Paradox). Let’s see what it is.

In those simpler times, you were allowed to define a set if you could describe it in English. And, owing to mathematicians’ predilection for self-reference, sets could contain other sets.

Russell then, came up with this:

\(R\)  is a set of all the sets which do not contain themselves.

The question was "Does \(R \) contain itself?" If it doesn’t, then according to the second half of the definition it should. But if it does, then it no longer meets the definition.

The same can symbolically be represented as:

Let \(R = \{ x \mid x \not \in x \} \), then \(R \in R \iff R \not \in R \)

Cue mind exploding.
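
If you’d rather watch the loop happen in code than in set notation, here’s a tiny Python rendering of the same trouble. (Treating a ‘set’ as a membership predicate is an illustrative liberty, not Russell’s actual construction.)

# A "set" here is just a membership test; R asks "does x contain itself?"
# and answers the opposite.
R = lambda x: not x(x)

try:
    R(R)  # R contains R iff R does not contain R ...
except RecursionError:
    print("No consistent answer - the question never bottoms out.")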

“Grelling’s paradox” is a startling variant which uses adjectives instead of sets. If adjectives are divided into two classes, autological (self-descriptive) and heterological (non-self-descriptive), then, is ‘heterological’ heterological? Try it!

Epimenides Paradox

Or, the so-called Liar Paradox, was another such paradox, which shredded whatever concept of ‘computability’ existed at that time - the notion that things could either be true or false.

Epimenides was a Cretan, who made one immortal statement:

“All Cretans are liars.”

If all Cretans are liars, and Epimenides was a Cretan, then he was lying when he said that “All Cretans are liars”. But wait, if he was lying then, how can we ‘prove’ that he wasn’t lying about lying? Ein?


This is what makes it a paradox: a statement so rudely violating the assumed dichotomy of statements into true and false, because if you tentatively think it’s true, it backfires on you and makes you think that it is false. And a similar backfire occurs if you assume that the statement is false. Go ahead, try it!

If you look closely, there is one common culprit in all of these paradoxes, namely ‘self-reference’. Let’s look at it more closely.

Strange Loopiness

If self-reference, or what Douglas Hofstadter - whose prolific work on the subject matter has inspired this blog post - calls ‘Strange Loopiness’ was the source of all these paradoxes, it made perfect sense to just banish self-reference, or anything which allowed it to occur. Russell and Whitehead, two rebel mathematicians of the time, who subscribed to this point of view, set forward and undertook the mammoth exercise, namely “Principia Mathematica”, which, as we will see in a little while, was utterly demolished by Gödel’s findings.

The main thing which made it difficult to ban self-reference was that it was hard to pinpoint where exactly the self-reference occurred. It may well be spread out over several steps, as in this ‘expanded’ version of Epimenides:

The next statement is a lie.

The previous statement is true.

Russell and Whitehead, in P.M. then, came up with a multi-hierarchy set theory to deal with this. The basic idea was that a set of the lowest ‘type’ could only contain ‘objects’ as members (not sets). A set of the next type could then only either contain objects, or sets of lower types. This implicitly banished self-reference.

Since all sets must have a type, a set ‘which contains all sets which are not members of themselves’ is not a set at all, and thus you can say that Russell’s paradox was dealt with.

Similarly, if an attempt is made to apply the expanded Epimenides to this theory, it must fail as well: for the first sentence to refer to the second one, it has to be hierarchically above it - in which case, the second one can’t loop back to the first.

Thirty one years after David Hilbert set before the academia to rigorously demonstrate that the system defined in Principia Mathematica was both consistent (contradiction-free) and complete (i.e. every true statement could be evaluated to true within the methods provided by P.M.), Gödel published his famous Incompleteness Theorem. By importing the Epimenides Paradox right into the heart of P.M., he proved that not just the axiomatic system developed by Russell and Whitehead, but none of the axiomatic systems whatsoever were complete without being inconsistent.

Clearly enough, P.M. lost its charm in the realm of academics.

Even before Gödel’s work, P.M. wasn’t particularly loved.

Why?

It isn’t just limited to this blog post; we humans, in general, have an appetite for self-reference - and this quirky theory severely limits our ability to abstract away details - something which we love, not only as programmers, but as linguists too - so much so, that the preceding paragraph, “It isn’t … this blog … we humans …”, would be doubly forbidden, because the ‘right’ to mention ‘this blog post’ is limited only to something which is hierarchically above blog posts, ‘metablog-posts’. Secondly, me (presumably a human), belonging to the class ‘we’, can’t mention ‘we’ either.

Since we humans love self-reference so much, let’s discuss some ways in which it can be expressed in written form.

One way of making such a strange loop, and perhaps the ‘simplest’ is using the word ‘this’. Here:

  • This sentence is made up of eight words.
  • This sentence refers to itself, and is therefore useless.
  • This blog post is so good.
  • This sentence conveys you the meaning of ‘this’.
  • This sentence is a lie. (Epimenides Paradox)

Another amusing trick for creating a self-reference without using the phrase ‘this sentence’ is to quote the sentence inside itself.

Someone may come up with:

The sentence ‘The sentence contains five words’ contains five words.

But, such an attempt must fail, for to quote a finite sentence inside itself would mean that the sentence is smaller than itself. However, infinite sentences can be self-referenced this way.

The sentence
    "The sentence
        "The sentence
                                ...etc
                                ...etc
        is infinitely long"
    is infinitely long"
is infinitely long"

There’s a third method as well, which you already saw in the title - the Quine method. The term ‘Quine’ was coined by Douglas Hofstadter in his book “Gödel Escher, Bach” (which heavily inspires this blog post). When using this, the self-reference is ‘generated’ by describing a typographical entity, isomorphic to the quine sentence itself. This description is carried in two parts - one is a set of ‘instructions’ about how to ‘build’ the sentence, and the other, the ‘template’ contains information about the construction materials required.

The Quine version of Epimenides would be:

“yields falsehood when preceded by its quotation” yields falsehood when preceded by its quotation

Before going on with ‘quining’, let’s take a moment and realize how awfully powerful our cognitive capacities are, and what goes in our head when a cognitive payload full of self-references is delivered - in order to decipher it, we not only need to know the language, but also need to work out the referent of the phrase analogous to ‘this sentence’ in that language. This parsing depends on our complex, yet totally assimilated ability to handle the language.

The idea of referring to itself is quite mind-blowing, and we keep doing it all the time - perhaps that is why it feels so ‘easy’ for us to do so. But we aren’t born that way; we grow that way. This could be better realized by telling someone much younger, “This sentence is wrong.” They’d probably be confused - what sentence is wrong? The reason it’s so simple for self-reference to occur, and hence for paradoxes to arise, in our language is, well, our language. It allows our brain to do the heavy lifting of working out what the author is trying to get across, without being verbose.

Back to Quines.

Reproducing itself

Now that we are aware of how ‘quines’ can manifest as self-reference, it would be interesting to see how the same technique can be used by a computer program to ‘reproduce’ itself.

To make it further interesting, we shall choose the language most apt for the purpose - brainfuck:

>>>>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++
+++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++[->++++++++<]>--....[-]<<[<]<<++++++++[->+++++>++++++++<<]>+++>-->[[-<<<+>.>>]<.[->+<]<[->+<]>>>]<<<<[<]>[.>]

Running the program above produces itself as the output. I agree, it isn’t the most descriptive program in the world, so the Python program below is the nearest we can get to describing what’s happening inside those horrible chains of +’s and >’s:

THREE_QUOTES = '"' * 3

def eniuq(template): print(
  f'{template}({THREE_QUOTES}{template}{THREE_QUOTES})')

eniuq("""THREE_QUOTES = '"' * 3

def eniuq(template): print(
  f'{template}({THREE_QUOTES}{template}{THREE_QUOTES})')

eniuq""")

The first line generates """ on the fly, which marks multiline strings in Python.

The next two lines define the eniuq function, which prints the argument template twice - once plain, and then again wrapped in parentheses and triple quotes.

The last 4 lines cleverly call this function so that the output of the program is the source code itself.

Since we are printing in an order opposite of quining, the name of the function is ‘quine’ reversed -> eniuq (name stolen from Hofstadter again)
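
For comparison, the classic minimal Python quine uses the same instructions-plus-template idea, with %r doing the quoting instead of triple quotes:

# A two-line classic: the string is the template, print() the instruction.
# (These comment lines aren't reproduced, so the quine proper is just the
# two lines below.)
s = 's = %r\nprint(s %% s)'
print(s % s)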

Remember the discussion about how self-reference capitalizes on the processor? What if ‘quining’ was a built-in feature of the language, providing what we in programmer lingo call ‘syntactic sugar’?

Let’s assume that an asterisk, * in the brainfuck interpreter would copy the instructions before executing them, what would then be the output of the following program?

*

It’d be an asterisk again. You could make an argument that this is silly, and should be counted as ‘cheating’. But, it’s the same as relying on the processor, like using “this sentence” to refer to this sentence - you rely on your brain to do the inference for you.

What if eniuq were a built-in keyword in Python? A perfect self-rep would then be just a call away:

eniuq('eniuq')

What if quine was a verb in the English language? We could reduce a lot of explicit cognitive processes required for inference. The Epimenides paradox would then be:

“yields falsehood if quined” yields falsehood if quined

Now that we are talking about self-rep, here’s one last piece of entertainment for you.

The Tupper’s self-referential formula

This formula is defined through an inequality:

\({1 \over 2} < \left\lfloor \mathrm{mod}\left(\left\lfloor {y \over 17} \right\rfloor 2^{-17 \lfloor x \rfloor - \mathrm{mod}(\lfloor y\rfloor, 17)},2\right)\right\rfloor\)

If you take that absurd thing above, and move around in the cartesian plane for the coordinates \(0 \le x \le 106, k \le y \le k + 17\), where \(k\) is a 544 digit integer (just hold on with me here), color every pixel black for True, and white otherwise, you'd get:
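
If you want to reproduce the plot yourself, the floors and the power of two boil down to picking out bits of one very large integer, which plain Python handles happily. A rough sketch (the 544-digit \(k\) is omitted here, and depending on how you map pixels to the screen you may need to mirror the x axis):

# Rough sketch: the pixel at (x, y) is black exactly when bit
# (17*x + y % 17) of floor(y / 17) is set.
def tupper_rows(k):
    rows = []
    for y in range(k + 16, k - 1, -1):           # top of the plot first
        row = ""
        for x in range(105, -1, -1):             # mirror to read left-to-right
            bit = (y // 17 >> (17 * x + y % 17)) & 1
            row += "#" if bit else " "
        rows.append(row)
    return rows

# k = <the 544-digit integer goes here>
# print("\n".join(tupper_rows(k)))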

This doesn't end here. If \(k\) is now replaced with another integer containing 291 digits, we get yours truly:

January 19, 2020 06:30 PM

October 31, 2019

Shakthi Kannan

TeX User Group Conference 2019, Palo Alto

The Tex User Group 2019 conference was held between August 9-11, 2019 at Sheraton Palo Alto Hotel, in Palo Alto, California.

Stanford

I wanted to attend TUG 2019 for two main reasons - to present my work on the “XeTeX Book Template”, and also to meet my favourite computer scientist, Prof. Donald Knuth. He does not travel much, so it was one of those rare opportunities for me to meet him in person. His creation, the TeX computer typesetting system - where you can represent any character mathematically, and also program and transform it - is beautiful, powerful, and the best typesetting software in the world. I have been using TeX extensively for my documentation and presentations over the years.

Day I

I reached the hotel venue only in the afternoon of Friday, August 9, 2019, as I was also visiting Mountain View/San Jose on official work. I quickly checked into the hotel and completed my conference registration formalities. When I entered the hall, Rishi T from STM Document Engineering Private Limited, Thiruvananthapuram was presenting a talk on “Neptune - a proofing framework for LaTeX authors”. His talk was followed by an excellent poetic narration by Pavneet Arora, who happened to be a Vim user, but, also mentioned that he was eager to listen to my talk on XeTeX and GNU Emacs.

After a short break, Shreevatsa R, shared his experiences on trying to understand the TeX source code, and the lessons learnt in the process. It was a very informative, user experience report on the challenges he faced in navigating and learning the TeX code. Petr Sojka, from Masaryk University, Czech Republic, shared his students’ experience in using TeX with a detailed field report. I then proceeded to give my talk on the “XeTeX Book Template” on creating multi-lingual books using GNU Emacs and XeTeX. It was well received by the audience. The final talk of the day was by Jim Hefferon, who analysed different LaTeX group questions from newbies and in StackExchange, and gave a wonderful summary of what newbies want. He is a professor of Mathematics at Saint Michael’s College, and is well-known for his book on Linear Algebra, prepared using LaTeX. It was good to meet him, as he is also a Free Software contributor.

The TUG Annual General Meeting followed with discussions on how to grow the TeX community, the challenges faced, membership fees, financial reports, and plan for the next TeX user group conference.

Day II

The second day of the conference began with Petr Sojka and Ondřej Sojka presenting on “The unreasonable effectiveness of pattern generation”. They discussed the Czech hyphenation patterns along with a pattern generation case study. This talk was followed by Arthur Reutenauer presenting on “Hyphenation patterns in TeX Live and beyond”. David Fuchs, a student who worked with Prof. Donald Knuth on the TeX project in 1978, then presented on “What six orders of magnitude of space-time buys you”, where he discussed the design trade-offs in TeX implementation between olden days and present day hardware.

After a short break, Tom Rokicki, who was also a student at Stanford and worked with Donald Knuth on TeX, gave an excellent presentation on searching and copying text in PDF documents generated by TeX for Type-3 bitmap fonts. This session was followed by Martin Ruckert’s talk on “The design of the HINT file format”, which is intended as a replacement of the DVI or PDF file format for on-screen reading of TeX output. He has also authored a book on the subject - “HINT: The File Format: Reflowable Output for TeX”. Doug McKenna had implemented an interactive iOS math book with his own TeX interpreter library. This allows you to dynamically interact with the typeset document in a PDF-free ebook format, and also export the same. We then took a group photo:

Group photo

I then had to go to Stanford, so missed the post-lunch sessions, but, returned for the banquet dinner in the evening. I was able to meet and talk with Prof. Donald E. Knuth in person. Here is a memorable photo!

With Prof. Donald Knuth

He was given a few gifts at the dinner, and he stood up and thanked everyone and said that “He stood on the shoulders of giants like Isaac Newton and Albert Einstein.”

Gift to Prof. Donald Knuth

I had a chance to meet a number of other people who valued the beauty, precision and usefulness of TeX. Douglas Johnson had come to the conference from Savannah, Georgia and is involved in the publishing industry. Rohit Khare, from Google, who is active in the Representational State Transfer (ReST) community shared his experiences with typesetting. Nathaniel Stemen is a software developer at Overleaf, which is used by a number of university students as an online, collaborative LaTeX editor. Joseph Weening, who was also once a student to Prof. Donald Knuth, and is at present a Research Staff member at the Institute for Defense Analyses Center for Communications Research in La Jolla, California (IDA/CCR-L) shared his experiences in working with the TeX project.

Day III

The final day of the event began with Antoine Bossard talking on “A glance at CJK support with XeTeX and LuaTeX”. He is an Associate Professor of the Graduate School of Science, Kanagawa University, Japan. He has been conducting research regarding Japanese characters and their memorisation. This session was followed by a talk by Jaeyoung Choi on “FreeType MF Module 2: Integration of Metafont and TeX-oriented bitmap fonts inside FreeType”. Jennifer Claudio then presented the challenges in improving Hangul to English translation.

After a short break, Rishi T presented “TeXFolio - a framework to typeset XML documents using TeX”. Boris Veytsman then presented the findings on research done at the College of Information and Computer Science, University of Massachusetts, Amherst on “BibTeX-based dataset generation for training citation parsers”. The last talk before lunch was by Didier Verna on “Quickref: A stress test for Texinfo”. He teaches at École Pour l’Informatique et les Techniques Avancées, and is a maintainer of XEmacs, Gnus and BBDB. He is also an avid Lisper and one of the organizers of the European Lisp Symposium!

After lunch, Uwe Ziegenhagen demonstrated using LaTeX to prepare and automate exams. This was followed by a field report by Yusuke Terada, on how they use TeX to develop a digital exam grading system at large scale in Japan. Chris Rowley, from the LaTeX project, then spoke on “Accessibility in the LaTeX kernel - experiments in tagged PDF”. Ross Moore joined remotely for the final session of the day to present on “LaTeX 508 - creating accessible PDFs”. The videos of both of these last two talks are available online.

A number of TeX books were made available for free for the participants, and I grabbed quite a few, including a LaTeX manual written by Leslie Lamport. Overall, it was a wonderful event, and it was nice to meet so many like-minded Free Software people.

A special thanks to Karl Berry, who put in a lot of effort in organizing the conference, but could not make it due to a car accident.

The TeX User Group Conference in 2020 is scheduled to be held at my alma mater, Rochester Institute of Technology.

October 31, 2019 03:00 PM

October 16, 2019

September 24, 2019

Rahul Jha

A panegyric about my mentor, Omar Bhai

I was still up at this unearthly hour, thinking about life for a while now - fumbled thoughts about where I had come, where I started, and quite expectedly, Omar Bhai, your name popped in.

The stream continued. I started thinking about everything I’ve learned from you and was surprised with merely the sheer volume of thoughts that followed. I felt nostalgic!

I made a mental note to type this out the next day.

I wanted to do this when we said our final goodbyes and you left for the States, but thank God, I didn’t - I knew that I would miss you, but never could I have guessed that it would be so overwhelming - I would’ve never written it as passionately as I do today.

For those of you who don’t already know him, here’s a picture:

Omar Khursheed

I’m a little emotional right now, so please bear with me.

You have been warned - the words “thank you” and “thanks” appear irritatingly often below. I tried changing, but none other has quite the same essence.

How do I begin thanking you?

Well, let’s start with this - thank you for kicking me on my behind, albeit civilly, whenever I would speak nuisance (read chauvinism). I can’t thank you enough for that!

I still can’t quite get how you tolerated the bigot I was and managed to be calm and polite. Thank You for teaching me what tolerance is!

Another thing which I learnt from you was what it meant to be privileged. I can no longer see things the way I used to, and this has made a huge difference. Thank You!

I saw you through your bad times and your good. The way you tackled problems, and how easy you made it look. Well, it taught me [drum roll] how to think (before acting and not the other way round). Thank You for that too!

And, thank you for buying me books, and even more so, lending away so many of them! and even more so, educating me about why to read books and how to read them. I love your collection.

You showed all of us, young folks, how powerful effective communication is. Thank You again for that! I know, you never agree on this, but you are one hell of a speaker. I’ve always been a fan of you and your puns.

I wasn’t preparing for the GRE, but I sat in your sessions anyways, just to see you speak. The way you connect with the audience is just brilliant.

For all the advice you gave me on my relationships with people - telling me to back off when I was being toxic and dragging me off when I was on the receiving side - I owe you big time. Thank You!

Also, a hearty thank you for making me taste the best thing ever - yes, fried cheese it is. :D

Fried Cheese

Thank You for putting your trust and confidence in me!

Thank you for all of this, and much more!

Yours Truly, Rahul

September 24, 2019 06:30 PM

September 11, 2019

Priyanka Saggu

Some pending logs!

September 11, 2019

It’s been a very long time since I wrote here for the last.

The reason is nothing big but mainly because:

  1. Apparently, I was not able to finish, in time, some of the tasks that I used to write about.
  2. I was not well for a long time, which could be another reason.
  3. Besides, life happened in many ways which ultimately left me working on some other things first, because they seemed to be *important* at the time.

And, yes, there is no denying the fact that I was procrastinating too, because writing seems to be really hard most of the time.

Though I had worked on many things throughout this time, and I’ll try to write them up here as short and quick logs below.



This one question always came up, many times, the students managed to destroy their systems by doing random things. rm -rf is always one of the various commands in this regard.

Kushal Das
  • While I was doing the above task, at one point I ruined my local system’s mail server configs and actually ended up doing something which kushal writes about in one of his recent posts (quoted above). I was using the command rm -rf to clean some of the left-over dependencies of some mail packages, but that eventually resulted in the machine crashing. And that was not the end of the mess this time. I made another extremely big mistake meanwhile. I was trying to back up the crashed system onto an external hard disk using dd. But because I had never used dd before, I again did something wrong, and this time I ended up losing ~500 GB of backed-up data. This is “the biggest mistake” and “the biggest lesson” I have learnt so far. 😔😭 (now I know why one should have multiple backups) And as there was absolutely no way of getting that much data back, the last thing I did was format the hard disk into 2 partitions, one with an ext4 file system for the Linux backup and the other as NTFS for everything else.

Thank you so much jasonbraganza for all the help and extremely useful suggestions during the time. 🙂


  • Okay, now, after all the hustle-bustle above, I got something really nice. This time, I received the “Raspberry Pi 4, 4GB, Complete Kit” from kushal.

Thank you very much kushal for the RPi and an another huge thanks for providing me with all the guidance and support that made me reach to even what I am today. 🙂


  • During the same time, I attended a dgplug guest session from utkarsh2102. This session gave me a “really” good beginner’s insight into how things actually work in the Debian Project. I owe a big thanks to utkarsh2102 as well, for he so nicely volunteered me, from there onwards, to actually start with the Debian project. I have started with the DPMT and have packaged 4 Python modules so far. And now, I am looking forward to starting to contribute to the Debian Ruby Team as well.

  • With the start of September, I spent some time solving some basic Python problems from kushal’s lymworkbook. Those issues were related to some really simple sysadmin work. But for me, working through them and wrapping them in Python was a whole lot of learning. I hope I will continue to solve some more problems/issues from the lab.

  • And lastly (and currently), I am back to reading and implementing concepts from Ops School curriculum.

Voila! I have finally finished compiling the logs from the last 20 days of work and other stuff (and thus, I am eventually finishing my long-pending task of writing this post here as well).

I will definitely try to be more consistent with my writing from now onwards. 🙂

That’s all for now. o/

by priyankasaggu119 at September 11, 2019 05:28 PM

August 29, 2019

Sayan Chowdhury

Why I prefer SSH for Git?

Why I prefer SSH for Git?

In my last blog, I quoted

I'm an advocate of using SSH authentication and connecting to services like Github, Gitlab, and many others.

On this, I received a bunch of messages over IRC asking why I prefer SSH for Git over HTTPS.


I find the Github documentation quite helpful when it comes to learning the basic operations of Git and Github. So, what does Github have to say about "SSH v/s HTTPS"?

Github earlier used to recommend using SSH, but they later changed it to HTTPS. The reasons for Github's current recommendation could be:

  • Ease to start with: HTTPS is very easy to start with, as you don't have to set up your SSH keys separately. Once the account is created, you can just go over and start working with repositories. Though, the first issue that you hit is that you need to enter your username/password for every operation that you need to perform with git. This can be overcome by caching or storing the password using Git's credential storage. If you cache, then it is cached in memory for a limited period after which it is flushed so you need to enter your credentials again. I would not advise storing the password, as it is stored as plain-text on disk.
  • Easily accessible: HTTPS in comparison to SSH is easily accessible. Why? You may ask. The reason is a lot of times SSH ports are blocked behind a firewall and the only option left for you might be HTTPS. This is a very common scenario I've seen in the Indian colleges and a few IT companies.

Why do I recommend SSH-way?

SSH keys provide Github with a way to trust a computer. For every machine that I have, I maintain a separate set of keys. I upload the public keys to Github or whichever Git-forge I'm using. I also maintain a separate set of keys for the websites. So, for example, if I have 2 machines and I use Github and Pagure then I end up maintaining 4 keys. This is like a 1-to-1 connection of the website and the machine.

SSH is secure unless you end up losing your private key. Even if you do end up losing your key, you can just log in using your username/password and delete the particular key from Github. I agree that the attacker can do nasty things, but that would be limited to the repositories, and you would still have control of your account to quickly mitigate the problem.

On the other side, if you end up losing your Github username/password to an attacker, you lose everything.

I also once benefitted from using SSH with Github, but IMO, exposing that also exposes a vulnerability so I'll just keep it a secret :)

Also, if you are on a network that has SSH blocked, you can always tunnel it over HTTPS.

But, above all, do use 2-factor authentication that Github provides. It's an extra layer of security to your account.

If you have other thoughts on the topic, do let me know over twitter @yudocaa, or drop me an email.


Photo by Christian Wiediger on Unsplash

by Sayan Chowdhury at August 29, 2019 11:55 AM

August 26, 2019

Saptak Sengupta

Configuring Jest with React and Babel

Jest is a really good frontend testing framework and works great with React and Babel out of the box, along with Enzyme for component testing. But imports in a React and Babel codebase can often turn into nasty relative paths. I wrote in a previous blog about how to make better, cleaner imports using some webpack tweaks.

But the problem appears when we try to write Jest and Enzyme tests with them, because Babel can no longer understand and parse the imports. And without Babel parsing them and converting them to ES5, Jest cannot test the components. So we actually need a mix of Babel configuration and Jest configuration.

Note: This post assumes you already have jest, babel-jest and babel/plugin-transform-modules-commonjs packages installed using your favorite javascript package manager.

Basically, the workaround is: first, we need to resolve the cleaner imports into absolute paths using the Jest configuration, and then use the Babel configuration to parse the rest of the code (without the modules) to ES5.

The configuration files look something like these:

babel.config.js

module.exports = api => {
  const isTest = api.env('test');
  if (isTest) {
    return {
      presets: [
        [
          '@babel/preset-env',
          {
            modules: false,
          },
        ],
        '@babel/preset-react',
      ],
      plugins: [
        "@babel/plugin-transform-modules-commonjs",
      ],
    }
  } else {
    return {
      presets: [
        [
          '@babel/preset-env',
          {
            modules: false,
          },
        ],
        '@babel/preset-react',
      ],
    }
  }
};


jest.config.js

module.exports = {
  moduleNameMapper: {
    '^~/(.*)$': '<rootDir>/path/to/jsRoot/$1'
  }
}

So let's go through the code a little.

In babel.config.js, we check whether the code is currently running in the test environment. This is helpful because:
  1. Jest sets the environment to "test" when running a test, so it is easily identifiable
  2. It ensures that the test configuration doesn't mess with the non-test configuration in our Webpack config (or any other configuration you are using)
So in this case, I am returning the exact same Babel configuration that I need in my Webpack config in a non-test environment.

In the test configuration for Babel, we are using the plugin "@babel/plugin-transform-modules-commonjs". This is needed to parse all the non-component imports like React, etc., along with parsing the components from ES6 to ES5 after Jest does the path resolution. So it helps to convert the modules from ES6 to ES5.

Now, let's see the jest.config.js. The jest configuration allows us to do something called moduleNameMapper. This is a very useful configuration in many different usecases. It basically allows us to convert the module names or paths we use for module import to something that jest understands (or in our case, something that the Babel plugin can parse). 

So, the left-hand part of the attribute contains a regular expression which matches the pattern we are using for imports. Since our imports look something like '~/path/from/jsRoot/Component', the regular expression to capture all such imports is '^~/(.*)$'. Now, to convert them to absolute paths, we need to prepend '<rootDir>/path/to/jsRoot/' to the component path.

And, voila! That should allow Jest to properly parse, convert to ES5 and then test.

The best part? We can use the cleaner imports even in the .test.js files and this configuration will work perfectly with that too.

by SaptakS (noreply@blogger.com) at August 26, 2019 06:16 AM

Making cleaner imports with Webpack and Babel

You can bring in modules from a different JavaScript file using require-based JavaScript code, or normal Babel-parseable imports. But the code with these imports often becomes a little messy because of relative imports like:

import Component from '../../path/to/Component'

But a better, more cleaner way of writing ES6 imports is

import Component from '~/path/from/jsRoot/Component'

This hugely avoids the bad relative import paths that depend on where the component files are. Now, this is not parse-able by Babel itself, but webpack can resolve it using its resolve attribute. So your webpack config should have these two segments of code:

resolve: {
        alias: {
            '~': __dirname + '/path/to/jsRoot',
            modernizr$: path.resolve(__dirname, '.modernizrrc')
        },
        extensions: ['.js', '.jsx'],
        modules: ['node_modules']
    },

and

module: {
        rules: [
            {
                test: /\.jsx?$/,
                use: [
                    {
                        loader: 'babel-loader',
                        query: {
                            presets: [
                                '@babel/preset-react',
                                ['@babel/preset-env', { modules: false }]
                            ],
                        },
                    }
                ],
            },
        ]
}

The {modules: false} ensures that babel-preset-env doesn't handle the parsing of the module imports. You can check the following comment in a webpack issue to know more about this.

by SaptakS (noreply@blogger.com) at August 26, 2019 05:56 AM

July 29, 2019

Sayan Chowdhury

Force git to use git:// instead of https://

Force git to use git:// instead of https://

I'm an advocate of using SSH authentication and connecting to services like Github, Gitlab, and many others. I do make sure to use the git:// URL while cloning a repo, but sometimes I make the mistake of using https:// instead - only to realise it later, when git prompts me for my username to authenticate the connection. This is when I have to manually reset my git remote URL.

Today, I found a cleaner solution to this problem. I can use insteadOf to enforce the connection via SSH.

git config --global url."git@github.com:".insteadOf "https://github.com/"

This creates an entry in your .gitconfig:

[url "git@github.com:"]
	insteadOf = https://github.com/

Photo by Yancy Min on Unsplash

by Sayan Chowdhury at July 29, 2019 06:52 AM

April 08, 2019

Subho

Increasing Postgres column name length

This blog is more like a bookmark for me; the solution was scavenged from the internet. Recently I have been working on an analytics project where I had to generate pivot/transpose tables from the data. This is the first time I faced the limitations set on a Postgres database. Since it's a pivot, one of my columns would be transposed and its values used as column names, and this is where things started breaking. Writing to Postgres failed with an error stating that column names are not unique. After some digging I realized Postgres has a column name limit of 63 bytes and anything longer than that is truncated; hence, post truncation, multiple keys became the same, causing this issue.

The next step was to look at the data in my column; it ranged from 20 to 300 characters in length. I checked with Redshift and BigQuery: they have similar limitations too, 128 bytes. After looking for some time I found a solution: I downloaded the Postgres source, changed NAMEDATALEN to 301 (remember, the column name length is always NAMEDATALEN – 1) in src/include/pg_config_manual.h, and followed the steps from the Postgres docs to compile the source, install it, and run Postgres. This has been tested on Postgres 9.6 as of now and it works.

Next up, I faced issues with the maximum number of columns: my pivot table had 1968 columns and Postgres has a limit of 1600 total columns. Following this answer I looked into the source comments, and that looked quite overwhelming 😛. Also, I do not have control over how many columns there will be post pivot, so no matter what value I set, in the future I might need more columns. So instead I handled the scenario in my application code, splitting the data across multiple tables and storing them.
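
The splitting itself doesn't need anything fancy. As a minimal sketch of the idea - assuming the pivot lives in a pandas DataFrame and gets written out via SQLAlchemy, with made-up table and connection names:

import pandas as pd
from sqlalchemy import create_engine

MAX_COLS = 1500  # keep a margin below Postgres's 1600-column ceiling

def store_wide_frame(df: pd.DataFrame, base_name: str, engine) -> None:
    # Write each slice of at most MAX_COLS columns to its own table.
    for part, start in enumerate(range(0, df.shape[1], MAX_COLS)):
        chunk = df.iloc[:, start:start + MAX_COLS]
        chunk.to_sql(f"{base_name}_part{part}", engine, if_exists="replace")

engine = create_engine("postgresql://user:password@localhost/analytics")
# store_wide_frame(pivot_df, "pivot_result", engine)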

References:

  1. https://til.hashrocket.com/posts/8f87c65a0a-postgresqls-max-identifier-length-is-63-bytes
  2. https://stackoverflow.com/questions/6307317/how-to-change-postgres-table-field-name-limit
  3. https://www.postgresql.org/docs/9.6/install-short.html
  4. https://dba.stackexchange.com/questions/40137/in-postgresql-is-it-possible-to-change-the-maximum-number-of-columns-a-table-ca

by subho at April 08, 2019 09:25 AM