Planet dgplug

A (Corona) walk in the park

lone strollers with dogs and bottles of wine
going out for walks under basswood and pine

children singing and whirling around
playin' their favorite games on a crowded playground

an army of joggers on bouncing feet
elderly people tryin' to cross the street

the twilight sky, dipped in orange and blue
the silvery loom of the crescent moon

more faces than I have seen in weeks
November wind gives them rosy cheeks

from football games and from jam sessions

engrossed in music engrossed in sports
engrossed in coffee, talks and thoughts

a spirit of life and truth and hope
and the occasional smell of dope

like collecting pieces to make me whole
like an emergency bandage for my soul

Putting Emacs Backup Files in a Separate Location

Whenever Emacs saves a file, it makes a backup of the original.
So if I have a file.txt and I make changes and save it, Emacs first backs up the original to file.txt~.
While I love this functionality, and it has saved me from a pickle more than once, I don’t love the way my folders get polluted with ~ files all over the place.
My blog’s drafts folder had hundreds of these.

Mercifully, Emacs offers a way to stow all these files in one central folder of your choosing.
All I did was add this little snippet to my init.el:

;; Set the folder for backup files to a subfolder in the
;; .emacs.d folder of the user.

(setq backup-directory-alist
      '(("." . "~/.emacs.d/file-backups")))


Emacs now stops littering all over the place and saves the backups it creates to a file-backups folder in my .emacs.d folder.
You can choose a location you like better :)
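
If a folder already holds stray backups from before this change, a small shell helper can list them. This is only a sketch: list_emacs_backups is a made-up name and ~/blog/drafts is a placeholder path. It prints the files and deletes nothing.

```shell
# Hypothetical helper: list leftover Emacs backup files (names ending in "~")
# below a directory. It only prints them; nothing is moved or deleted.
list_emacs_backups() {
    dir="${1:-.}"
    find "$dir" -type f -name '*~' | sort
}

# Example with a placeholder path:
# list_emacs_backups ~/blog/drafts
```

Once the list looks right, the files can be removed by hand or left alone; new backups will go to the central folder either way.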

P.S. Subscribe to my mailing list!
P.P.S. Feed my insatiable reading habit.

Jason Braganza (Work)

Could not focus much on programming today.
So I decided to do things with Python programs instead.

I use Nikola to generate both my websites.
It is an extremely easy-to-use, no-fuss static site generator, which is easy on my server’s resources.
Version 8.1.2 was released a few hours ago and I hopped on and installed it.
I follow a slightly unconventional upgrade path, because I was terrified of breaking my server in the early days, when I was still learning about how to go about installing things on servers.

• I have Miniconda installed.
• I use that to generate a conda environment, which I then install Nikola into.
• When a new release drops, I create a new conda environment, install the new release in there and run it against my source folders (after backing them up).
• This lets me revert very quickly to the old data and the older version of Nikola, in case I do something boneheaded and screw things up.
• If all works fine for a month, I delete the older conda environment along with the old Nikola release.
• I have been doing this for quite a while now, and while it may be overkill, it gives me peace of mind.
• As usual, Nikola upgraded with no issues at all.
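
The steps above could be sketched as a shell function. This is a sketch under loud assumptions, not necessarily the exact commands used here: the function name upgrade_nikola, the one-environment-per-release naming scheme, and installing Nikola with pip inside the conda environment are all guesses.

```shell
# Hypothetical helper: side-by-side Nikola upgrade in a fresh conda environment.
# Usage: upgrade_nikola <site source folder> <Nikola version>
upgrade_nikola() {
    site_dir="$1"                 # Nikola source folder
    version="$2"                  # e.g. 8.1.2
    new_env="nikola-$version"     # assumed naming scheme

    # 1. Back up the source folder before touching it.
    cp -a "$site_dir" "$site_dir.bak" || return 1

    # 2. Create a fresh environment and install the new release into it.
    conda create --name "$new_env" python=3 --yes || return 1
    conda run --name "$new_env" python -m pip install "Nikola[extras]==$version" || return 1

    # 3. Build the site with the new release; the old environment stays untouched.
    (cd "$site_dir" && conda run --name "$new_env" nikola build)
}

# Example with a placeholder path:
# upgrade_nikola "$HOME/sites/blog" 8.1.2
```

If all works fine for a month, the older environment can be dropped with conda env remove --name followed by the old environment's name.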

Pleroma Bot

I always wanted to understand what bots do.
Now I realise they are anthropomorphic programs that (with the right permissions and the right credentials) look like actual users of a service, doing sets of activities they are programmed to do.
Like how your banking app has a set of solutions to common queries that it shows you before it hands off to an actual human, if those solutions don’t fit your needs.
They feel like supercharged scripts to me.
So I decided to see if I could install one.
Since I am learning Python, I love reading the low volume Daily Python Tip Twitter account.
It has surprising, handy, funny, interesting tips and tricks about the Python language and the massive ecosystem around it.
But, I have weaned myself off Twitter for my own sanity.
I only use it sparingly once a day.
And going back to that noise and tumult no longer interests me.
I saw a bot that mirrored tech news accounts to Mastodon and wondered if a bot could get Daily Python Tip to me on my Pleroma timeline.
A quick search led me to Pleroma Bot.
A couple of struggling hours later (creating a Twitter developer account, creating a Pleroma account for the bot, figuring out how to get bearer tokens for said accounts), and tada, I got the bot to come alive!
It checks the Twitter account once a day, and mirrors the tweets to the bot account.
You’re most welcome to follow it for a Python tip, daily.
The next thing to do is to see if I can get a bot to post a tagged Pleroma status to my Twitter account.
But that is something for another day …
Update, 2020/11/17: Figured out how to use the bot to mirror multiple accounts. It now mirrors Daily Python Tip, RegexTip, and CompSciFact.


How to use Hugo Modules

I use Hugo for this site. I migrated to Hugo from WordPress a year ago. Hugo is a static site generator written in Go. The Hugo project has frequent releases, so I usually update the version once every few months. This involves reading all the release notes and making any changes to the theme if required. Theme changes are rarely required, though. While going through the release notes of v0.

What I Learnt from Antifragile (III)

This post was sent to my newsletter on October 25th, 2020.
You really ought to subscribe :)

What Does Not Kill Me … Antifragility for the Collective

What does not kill me makes me stronger.
— Friedrich Nietzsche, Maxims and Arrows

The world breaks everyone and afterward many are strong at the broken places.
But those that will not break it kills.
It kills the very good and the very gentle and the very brave impartially.
If you are none of these you can be sure it will kill you too but there will be no special hurry.
— Ernest Hemingway, A Farewell to Arms

That, in a nutshell, explains my learning today.
To become Antifragile, we need to be aware of what we do, of our actions, and then see who they ultimately benefit.

Antifragility, as I imagined it in my head, was me getting stronger with every blow that life dealt me, Hydra-like.
If you cut off one head, I would just grow another one.
But I realised that everything comes at the expense of something.
If I am growing Antifragile, something else has to give.
If there’s just nature on the other side, we’re fine. But if there are people, then we had better be careful about how we get Antifragile.

There’s also the notion of scale. Something small, dying to make the bigger collective stronger.
So there’s always a balancing act and a constant need to be observant.

This is how I imagine it, in a few scenarios in my life.

My cells need to die, my muscles need to tear in order that I build up my strength.
Here the individual is nature (my cells), dying in order to make me (my body, the collective) much stronger.
Imagine if that did not happen, if each cell decided, why should I die?
Actually, you don’t have to imagine.
That’s what cancer is.

If I die this year, I am insured. My folks get a hefty payout. (Antifragility for my family’s finances)
But the insurance company can only afford to do this, because they have money from a ton of folks placing the same bet as I did. That we would croak this year.
This is again the individual (me) benefiting the collective (the vast pool of people, who do not have enough saved, yet want to provide for their loved ones).

A nasty one is when someone uses this for their own benefit.
They win, but the collective loses.
Scams of various sorts come to mind here, where an unscrupulous person takes advantage and becomes antifragile at the expense of other people.

The flip side is the hero, where the individual takes risks and makes sacrifices to benefit the collective.
Soldiers, Teachers, Firefighters, Tinkerers, Entrepreneurs are all examples of folks who sacrifice individually so that society as a whole benefits.

So this is what I learnt. To see how actions towards antifragility (mine or others) have ramifications on my life.
To always see if I or society benefit. Not someone else taking undue advantage.


How to use Yubikey or any GPG smartcard in Thunderbird 78

Thunderbird is the free and open source email client by the Mozilla Foundation. I have been using it for some years now. Until now, Thunderbird users had to use the Enigmail extension to use GnuPG. Thunderbird 78 now uses a different implementation of OpenPGP called RNP.

Since the RNP library still does not support the use of secret keys on smartcards, to use a Yubikey or any other GnuPG-enabled smartcard, we need to manually configure Thunderbird with GnuPG. The steps are as follows:

Install GPGME

sudo dnf install gpgme


GPGME, the GnuPG Made Easy library, makes GnuPG easily accessible by providing a high-level crypto API for encryption, decryption, signing, verification and key management. I already have GnuPG installed on my Fedora 33 machine and my Yubikey ready.

Modify Thunderbird configuration

Go to the Preferences menu then click on the config editor button at the very end.

Click on the "I accept the risk" button.

Search for mail.openpgp.allow_external_gnupg and switch it to true.

Remember to restart Thunderbird after that.

Configure the secret key usage from Yubikey

Now go to the Account Settings and then go to End-To-End Encryption in the sidebar. Select the "Use your external key through GnuPG (e.g. from a smartcard)" option and click on Continue.

Type your Secret Key ID in the box and click on Save key ID.
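
To find the key ID to type in, GnuPG can list the secret keys it knows about. A minimal sketch, assuming gpg is on the PATH and the Yubikey's key stubs are already in the GnuPG keyring; secret_key_ids is a made-up helper name.

```shell
# Hypothetical helper: print the long key IDs of all secret keys known to GnuPG.
# In the machine-readable --with-colons listing, each "sec" record carries
# the key ID in its fifth colon-separated field.
secret_key_ids() {
    gpg --list-secret-keys --with-colons | awk -F: '$1 == "sec" { print $5 }'
}

# Example:
# secret_key_ids
```

Running gpg --card-status beforehand is a quick way to confirm that GnuPG can see the Yubikey at all.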

Now open the OpenPGP Key Manager and import your public key and then verify.

Now you can start using your hardware token in Thunderbird.

In this case we have to use 2 keyrings - GnuPG and RNP’s keyring (internal in Thunderbird). This is an extra step, which I hope in future can be avoided.

Using Mailvelope with Yubikey in Linux

Mailvelope is a web browser extension for sending end-to-end encrypted emails. It is a good option for users who want to send end-to-end encrypted emails without changing the email service they use. It is licensed under AGPL v3, making it Free and Open Source software. The code is on GitHub for the community to have a look. It can be added as an extension to the Chrome, Firefox and Edge browsers to securely encrypt emails with PGP using your existing email provider.

Mailvelope does provide end-to-end encryption for the email content but does not protect the metadata (subject, IP address of the sender) from third parties. Like most email encryption tools, it does not work in mobile browsers. There is a detailed user guide on Mailvelope from the Freedom of the Press Foundation, which is really helpful for new users.

By default, Mailvelope uses its own keyring. To use my Yubikey along with GnuPG keyring, I had to take the following steps:

Install gpgme

We need gpgme installed. On my Fedora 33 I did:

$ sudo dnf install gpgme -y


For Chrome browser

We have to create a gpgmejson.json file in the ~/.config/google-chrome/NativeMessagingHosts directory and write the following JSON in there.

{
    "name": "gpgmejson",
    "description": "Integration with GnuPG",
    "path": "/usr/bin/gpgme-json",
    "type": "stdio",
    "allowed_origins": [
        "chrome-extension://kajibbejlbohfaggdiogboambcijhkke/"
    ]
}


For Firefox

mkdir -p ~/.mozilla/native-messaging-hosts


After creating the native-messaging-hosts directory inside the Mozilla directory, add a gpgmejson.json file there with the following content.

vim ~/.mozilla/native-messaging-hosts/gpgmejson.json


{
    "name": "gpgmejson",
    "description": "Integration with GnuPG",
    "path": "/usr/bin/gpgme-json",
    "type": "stdio",
    "allowed_extensions": [
        "jid1-AQqSMBYb0a8ADg@jetpack"
    ]
}


Remember to restart the respective browser after you add the .json file. Then go to the Mailvelope extension to select the GnuPG keyring.

November 02, 2020

Kushal Das

Johnnycanencrypt 0.4.0 released

Last night I released 0.4.0 of the johnnycanencrypt module for OpenPGP in Python. This release has one update in the API for creating a new key. Now, we can pass one single UID as a string, or multiple in a list, or even pass None to the key creation method. This means we can have User ID-less certificates, which sequoia-pgp allows.

I also managed to fix the bug so that users can use pip to install the latest release from https://pypi.org.

You will need the Rust toolchain; I generally install it from https://rustup.rs.

For Fedora

sudo dnf install nettle clang clang-devel nettle-devel python3-devel


For Debian/Ubuntu

sudo apt install -y python3-dev libnettle6 nettle-dev libhogweed4 python3-pip python3-venv clang


Remember to upgrade your pip version inside of the virtual environment if you are on Buster.

For macOS

Install nettle via brew.

Installing the package

❯ python3 -m pip install johnnycanencrypt
Collecting johnnycanencrypt
  Downloading https://files.pythonhosted.org/packages/50/98/53ae56eb208ebcc6288397a66cf8ac9af5de53b8bbae5fd27be7cd8bb9d7/johnnycanencrypt-0.4.0.tar.gz (128kB)
     |████████████████████████████████| 133kB 6.4MB/s
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: johnnycanencrypt
  Building wheel for johnnycanencrypt (PEP 517) ... done
  Created wheel for johnnycanencrypt: filename=johnnycanencrypt-0.4.0-cp37-cp37m-macosx_10_7_x86_64.whl size=1586569 sha256=41ab04d3758479a063a6c42d07a15684beb21b1f305d2f8b02e820cb15853ae1
  Stored in directory: /Users/kdas/Library/Caches/pip/wheels/3f/63/03/8afa8176c89b9afefc11f48c3b3867cd6dcc82e865c310c90d
Successfully built johnnycanencrypt
Installing collected packages: johnnycanencrypt
Successfully installed johnnycanencrypt-0.4.0
WARNING: You are using pip version 19.2.3, however version 20.2.4 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


Now, you can import the module inside of your virtual environment :)

Note: In the future, I may change the name of the module to something more meaningful :)

November 01, 2020

Kushal Das

High load average while package building on Fedora 33

Enabling Link time optimization (LTO) with rpmbuild is one of the new features of Fedora 33. I read the changeset page once and went back only after I did the Tor package builds locally. While building the package, I noticed that suddenly there are many processes with /usr/libexec/gcc/x86_64-redhat-linux/10/lto1 and my load average reached 55+. Here is a screenshot I managed to take in between.

Jason Braganza (Personal)

What I Learnt from Antifragile (II)

This post was sent to my newsletter on October 18th, 2020
You really ought to subscribe :)

I fell sick and missed writing last week.
I have to live up to the name of the newsletter, anyhoo.
It would not be erratic without me whiffing once in a while, non?
Apologies all around, anyway!

The Barbell Heuristic to Taking Risks

Basically a shortcut to figuring out whether you ought to do something or not, based on the risks it entails.
How do we take a decision, when we don’t know all the pros and cons?
How do we decide in an uncertain world?

Simply put, what kind / amount of loss am I willing to accept, to gain some reward?

Life basically consists of three outcomes:

1. The safe outcome, with little to no success, but you don’t lose anything.
2. The normal outcome, with some middling success, but you stand to lose some.
   • But here’s the kicker. You might stand to lose everything, if you don’t understand the risks you take, or if the risk is unknowable. (Blow ups due to these unknown risks are what are now popularly called, thanks to Taleb’s earlier book, Black Swans.)
3. High risk, high reward! You know you will lose, but if you win, you win Big! It helps if you take risks with a domain that you have deep expertise in.

Taleb suggests that the best and safest way to make decisions that propel you forward with minimal risk is to ignore point 2 altogether.
Most of your daily life decisions ought to be with point 1.
Some of your decisions go to point 3.

Your strategy is to be as hyper-conservative and hyper-aggressive as you can be, instead of being mildly aggressive or conservative.
— Taleb, The Black Swan

And that is the barbell strategy.
It looks like an unbalanced barbell actually, like the one on the cover of his next book, Skin in the Game.

This is what will give you maximum peace of mind.
Risk taking also becomes easier if you have options, like I pointed out last week.
The best worst case scenario is one that you have the option to reverse. (Like buying something you need, but are uncertain about how it’ll turn out. It is easy to buy if the thing comes with an option to return it if you don’t like it.)

Three personal life cases,

Money

I don’t understand investing.
I do know that I need a nest egg for when Abby & I are old :)
Ergo, barbell strategy.
Most of our money is parked in safe investments like the Provident Fund and fixed deposits.
And some of it, in risky stuff like stocks and equity mutual funds. (Since this is not my domain of expertise, I pay someone trustworthy to help me out.)
So when the market crashed earlier this year, I was not as worried as other folk.
I did not lose my shirt. I could follow the advice given at the time (Stay Invested) with a clear, calm mind.
And I am confident compounding will work its magic over time over both sets of investments.

Work

Let’s put the barbell to work here too.
Your safe, boring job is one end of the spectrum.
Your risky side projects, hustles, are the other.
You need both.
You could either do both together, or serially.
Work a safe job for a few years, then take up a risky moonshot, and if it doesn’t pan out, go back to another safe job.
While right now I am in between jobs, due to health reasons, in my earlier lives I had pretty boring jobs.
But, I write a lot. I teach a lot. And that brought me a lot of opportunities that helped me through really trying times in the past two decades.

Health

This is relevant to me, because it helped me get fit over the past year.
I have three slipped discs.
Do I have surgery? The docs are undecided.
Or rather they are, but are unwilling to guarantee how long the surgery would help me.
And I had bloated to 98 odd kgs at my worst.
What do I do?
So, the safe thing to do was to lose weight.
And the hard thing to do was to get strong.
Both of which are progressing along nicely.
I lost 30 kgs and am doing aggressive physio to build up my back muscles, so that my poor vertebrae don’t have to do all the heavy lifting by themselves.
I feel much better today, better than I did in my twenties!

And that about does it for using barbells to help make decisions.

That’s all we have for you this week :)
I really do hope this little heuristic changes your life as much as it did mine :)

October 04, 2020

Armageddon

Dotfiles with Chezmoi

A few months ago, I went on a search for a solution for my dotfiles.

I tried projects like GNU Stow, dotbot and a bare git repository. Each one of these solutions has its advantages and its disadvantages, but I found mine in Chezmoi.

Chezmoi ? That's French right ? How is learning French going to help me ?

Introduction

On a *nix system, whether Linux, BSD or even Mac OS now, the applications one uses have their configuration saved in the user's home directory. These files are called configuration files. Usually, these configuration files start with a . which on these systems designates hidden files (they do not show up with a simple ls). Due to their names, these configuration files are also referred to as dotfiles.

Note

I will be using dotfiles and configuration files interchangeably in this article, and they can be thought of as such.

One example of such files is the .bashrc file found in the user's home directory. It allows the user to configure bash and change some behaviours.

Now that we understand what dotfiles are, let's talk a little bit about the previously mentioned solutions. They deserve mentioning, especially if you're looking for such a solution.

GNU Stow

GNU Stow leverages the power of symlinks to keep your configuration in a centralized location. Wherever your repository lives, GNU Stow will mimic the internal structure of said repository in your home directory by smartly symlinking everything.

I said smartly because it tries to minimize the number of symlinks created by symlinking to common root directories if possible.

By having all your configuration files under one directory structure, it is easier to push it to any public repository and share it with others.

The downside is, you end up with a lot of symlinks. It is also worth mentioning that not all applications behave well when their configuration directories are symlinked. Otherwise, GNU Stow is a great project.

Dotbot

Dotbot is a Python project that aims at automating your dotfiles. It gives you great control over what and how to manage your dotfiles.

Having it written in Python means it is very easy to install; pip. It also means that it should be easy to migrate it to different systems.

Dotbot has a lot going for it. If you like the idea of having control over every aspect of your dotfiles, including the possibility of setting up the environment along with them, then dotbot is for you.

Well, it's not for me.

Bare Git Repository

This is arguably the most elegant solution of them all.

The nice thing about this solution is its simplicity and cleanliness. It is essentially creating a bare git repository somewhere in your home directory and specifying the home directory itself to be the working directory.

If you are wondering where one would use a bare git repository in real life other than this use case, well, you have no other place to turn than any git server. On the server, Gitea for example, your repository is only a bare repository. One has to clone it to get the working directory along with it.

Anyway, back to our topic. This is a great solution if you don't have to worry about things you would like to hide.

By hide, I mean things like credentials, keys or passwords which never belong in a repository. You will need to find solutions for these types of files. I was looking for something less involving and more involved.

Chezmoi to the rescue ?

Isn't that what they all say ?

I like how the creator(s) defines Chezmoi

Manage your dotfiles across multiple machines, securely.

Pretty basic, straight to the point.

Unfortunately, it's a little bit harder to grasp the concept of how it works.

Chezmoi basically generates the dotfiles from the local repository. These dotfiles are saved in different forms in the repository but they always generate the same output; the dotfiles. Think of Chezmoi as a dotfiles templating engine; in its most basic form, it saves your dotfiles as is and deploys them on any machine.

Working with Chezmoi

I think we should take a quick look at Chezmoi to see how it works.

Chezmoi is written in Golang, making it fairly easy to install, so I will forgo that boring part.

First run

To start using Chezmoi, one has to initialize a new Chezmoi repository.

chezmoi init


This will create a new git repository in ~/.local/share/chezmoi. This is now the source state, where Chezmoi will get your dotfiles.

Plain dotfiles management with Chezmoi

Now that we have a Chezmoi repository, we can start to populate it with dotfiles.

Let's assume that we would like to start managing one of our dotfiles with Chezmoi. I'm going with an imaginary application's configuration directory. This directory will hold different files with versatile content types. This is going to showcase some of Chezmoi's capabilities.

Note

This is how I use Chezmoi. If you have a better way to do things, I'd like to hear about it!

Adding a dotfile

This DS9 application has its configuration directory in ~/.ds9/, where we find the config file.

The configuration looks like any generic ini configuration.

[character/sisko]
Name = Benjamin
Rank = Captain
Credentials = sisko-creds.cred
Mastodon = sisko-api.mastodon


Nothing special about this file, let's add it to Chezmoi

chezmoi add ~/.ds9/config


Listing dotfiles

And nothing happened… Hmm…

chezmoi managed


/home/user/.ds9
/home/user/.ds9/config


Okay, it seems that it is being managed.

Diffing dotfiles

We can test it out by doing something like this.

mv ~/.ds9/config ~/.ds9/config.old
chezmoi diff


install -m 644 /dev/null /home/user/.ds9/config
--- a/home/user/.ds9/config
+++ b/home/user/.ds9/config
@@ -0,0 +1,5 @@
+[character/sisko]
+Name = Benjamin
+Rank = Captain
+Credentials = sisko-creds.cred
+Mastodon = sisko-api.mastodon


Alright, everything looks as it should be.

Apply dotfiles

But that's only a diff, how do I make Chezmoi apply the changes, because my dotfile is still config.old?

Okay, we can actually get rid of the config.old file and make Chezmoi regenerate the configuration.

rm ~/.ds9/config.old
chezmoi -v apply


Note

I like to use the -v flag to check what is actually being applied.

install -m 644 /dev/null /home/user/.ds9/config
--- a/home/user/.ds9/config
+++ b/home/user/.ds9/config
@@ -0,0 +1,5 @@
+[character/sisko]
+Name = Benjamin
+Rank = Captain
+Credentials = sisko-creds.cred
+Mastodon = sisko-api.mastodon


And we get the same output as the diff. Nice! The configuration file was also recreated, that's awesome.

Editing dotfiles

If you've followed so far, you might have wondered… If I edit ~/.ds9/config, then Chezmoi is going to override it!

YES, yes it will.

warning

Always use Chezmoi to edit your managed dotfiles. Do NOT edit them directly.

ALWAYS use chezmoi diff before every apply.

To edit your managed dotfile, simply tell Chezmoi about it.

chezmoi edit ~/.ds9/config


Chezmoi will use your $EDITOR to open the file for you to edit. Once saved, it's saved in the repository database.

Be aware, at this point the changes are not reflected in your home directory, only in the Chezmoi source state. Make sure you diff and then apply to make the changes in your home.
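
The edit, diff, apply cycle can be wrapped in one small shell function so that the diff is never skipped. This is a sketch only: cm_edit_apply is a made-up name, and it uses nothing beyond the chezmoi edit, chezmoi diff and chezmoi -v apply commands shown above.

```shell
# Hypothetical wrapper around the edit -> diff -> apply cycle.
# Usage: cm_edit_apply <path of a managed dotfile>
cm_edit_apply() {
    chezmoi edit "$1" || return 1   # edits the copy in the source state
    chezmoi diff                    # shows what would change in the home directory
    printf 'Apply these changes? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        chezmoi -v apply            # -v prints what is actually applied
    else
        echo "Nothing applied."
    fi
}

# Example:
# cm_edit_apply ~/.ds9/config
```

Answering anything other than y leaves the home directory untouched; the edit still sits in the source state until the next apply.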

Chezmoi repository management

As mentioned previously, the repository is found in ~/.local/share/chezmoi. I always forget where it is, luckily Chezmoi has a solution for that.

chezmoi cd


Now, we are in the repository. We can work with it as a regular git repository. When you're done, don't forget to exit.
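
For scripted use, where the interactive subshell opened by chezmoi cd is not convenient, the same git work can be done from outside the repository. A sketch, assuming the chezmoi source-path command (which prints the source directory) is available in your version; cm_commit is a made-up name and the default commit message is a placeholder.

```shell
# Hypothetical helper: commit everything in the Chezmoi source directory
# without entering it.
# Usage: cm_commit "commit message"
cm_commit() {
    src="$(chezmoi source-path)" || return 1
    git -C "$src" add --all &&
        git -C "$src" commit -m "${1:-Update dotfiles}"
}

# Example:
# cm_commit "Add ds9 config"
```

Pushing is left out on purpose, since it depends on whether a remote has been configured.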

Other features

It is worth mentioning at this point that Chezmoi offers a few more integrations.

Templating

Due to the fact that Chezmoi is written in Golang, it can leverage the power of the Golang templating system. One can replace repeatable values like email or name with a template like {{ .email }} or {{ .name }}.

This will result in a replacement of these templated variables with their real values in the resulting dotfile. This is another reason why you should always edit your managed dotfiles through Chezmoi.

Our previous example would look a bit different.

[character/sisko]
Name = {{ .sisko.name }}
Rank = {{ .sisko.rank }}
Credentials = sisko-creds.cred
Mastodon = sisko-api.mastodon


And we would add it a bit differently now.

chezmoi add --template ~/.ds9/config


warning

Follow the documentation to configure the values.
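
As a rough illustration of what configuring the values can look like, the template data can live in a [data] section of Chezmoi's config file. The sketch below assumes the default config location ~/.config/chezmoi/chezmoi.toml and the TOML format; write_sisko_data is a made-up name, and the values come from the example config earlier. It refuses to touch an existing file.

```shell
# Hypothetical helper: write the template data for the ds9 example.
# Usage: write_sisko_data [path to chezmoi.toml]
write_sisko_data() {
    config="${1:-$HOME/.config/chezmoi/chezmoi.toml}"   # assumed default location
    if [ -e "$config" ]; then
        echo "$config already exists; merge the [data.sisko] table by hand" >&2
        return 1
    fi
    mkdir -p "$(dirname "$config")"
    # Keys under [data] become template variables: {{ .sisko.name }}, {{ .sisko.rank }}
    cat > "$config" <<'EOF'
[data.sisko]
name = "Benjamin"
rank = "Captain"
EOF
}

# Example:
# write_sisko_data
```

With that in place, chezmoi execute-template '{{ .sisko.name }}' should print the configured name, which is a quick way to check a value before applying.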

Once you have the power of templating on your side, you can always take it one step further. Chezmoi has integration with a big list of password managers. These can be used directly in the configuration files.

In our hypothetical example, we can think of the credentials file (~/.ds9/sisko-creds.cred).

Name = {{ (keepassxc "sisko.ds9").Name }}
Rank = {{ (keepassxc "sisko.ds9").Rank }}
Access_Code = {{ (keepassxc "sisko.ds9").AccessCode }}


Do not forget that this is also using the templating engine. So you need to add it as a template.

chezmoi add --template ~/.ds9/sisko-creds.cred


File encryption

Wait, what ! You almost slipped away right there old fellow.

We have our Mastodon API key in the sisko-api.mastodon file. The whole file cannot be pushed to a repository. It turns out that Chezmoi can use gpg to encrypt your files making it possible for you to push them.

To add an encrypted file to the Chezmoi repository, use the following command.

chezmoi add --encrypt ~/.ds9/sisko-api.mastodon


Misc

There is a list of other features that Chezmoi supports that I did not mention. I did not use all the features offered yet. You should check the website for the full documentation.

Conclusion

I have fully migrated to Chezmoi. I have used all the features above, and it has worked flawlessly so far.

I like the idea that it offers all the features I need while at the same time staying out of the way. I find myself, often, editing the dotfiles in my home directory as a dev version. Once I get to a configuration I like, I add it to Chezmoi. If I ever mess up badly, I ask Chezmoi to override my changes.

I understand it adds a little bit of overhead with the use of chezmoi commands, which I aliased to cm. But the end result is a home directory which seems untouched by any tools (no symlinks, no copies, etc…) making it easier to migrate out of Chezmoi as a solution and into another one if I ever choose in the future.

Bookmark with Org-capture

I was reading, and watching, Mike Zamansky's blog post series about org-capture and how he manages his bookmarks. His blog and video series are a big recommendation from me, he is teaching me tons every time I watch his videos. His inspirational videos were what made me dig down on how I could do what he's doing but… my way…

I stumbled across this blog post that describes the process of using org-cliplink to insert the title of the post into an org-mode link. Basically, what I wanted to do is provide a link and get an org-mode link. Sounds simple enough. Let's dig in.

Org Capture Templates

I will assume that you went through Mike's part 1 and part 2 posts to understand what org-capture-templates are and how they work. I essentially learned it from him and I do not think I can do a better job than a teacher.

Now that we understand where we need to start from, let's explain the situation. We need to find a way to call org-capture and provide it with a template. This template will need to take a url and add an org-mode url in our bookmarks. It will look something like the following.

(setq org-capture-templates
      '(("b" "Bookmark (Clipboard)" entry (file+headline "~/path/to/bookmarks.org" "Bookmarks")
         "** %(some-function-here-to-call)\n:PROPERTIES:\n:TIMESTAMP: %t\n:END:%?\n" :empty-lines 1 :prepend t)))


I formatted it a bit so it would have some properties. I simply used the %t to put the timestamp of when I took the bookmark. I used the %? to drop me at the end for editing. Then some-function-here-to-call is a placeholder for a function that generates our bookmark section with a title.

The blog post I alluded to earlier solved it by using org-cliplink. While org-cliplink is great for getting titles and manipulating them, I don't really need that functionality. I can do it manually. Sometimes, though, I would like to copy a page… Maybe if there is a project that could attempt to do someth… Got it… org-web-tools.

Configuring org-capture with org-web-tools

You would assume that you would be able to just pop (org-web-tools-insert-link-for-url) in the previous block and you're all done. But uhhh….

Wrong number of arguments: (1 . 1), 0


No dice. What would seem to be the problem ?

We look at the definition and we find this.

(defun org-web-tools-insert-link-for-url (url)
  "Insert Org link to URL using title of HTML page at URL.
If URL is not given, look for first URL in `kill-ring'."
  (interactive (list (org-web-tools--get-first-url)))


I don't know why, exactly, it doesn't work by calling it straight away because I do not know emacs-lisp at all. If you do, let me know. I suspect it has something to do with (interactive) and the list provided to it as arguments.

Anyway, I can see it is using org-web-tools--org-link-for-url, which the documentation suggests does the same thing as org-web-tools-insert-link-for-url, but is not exposed with (interactive). Okay, we have bits and pieces of the puzzle. Let's put it together.

First, we create the function.

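(defun org-web-tools-insert-link-for-clipboard-url ()
  "Extend =org-web-tools-insert-link-for-url= to take URL from clipboard or kill-ring"
  (interactive)
  ;; Body missing from this copy; reconstructed from the text above (assumption).
  (org-web-tools--org-link-for-url (org-web-tools--get-first-url)))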


Then, we set our org-capture-templates variable to a list containing our single template.

(setq org-capture-templates
      '(("b" "Bookmark (Clipboard)" entry (file+headline "~/path/to/bookmarks.org" "Bookmarks")
         "** %(org-web-tools-insert-link-for-clipboard-url)\n:PROPERTIES:\n:TIMESTAMP: %t\n:END:%?\n" :empty-lines 1 :prepend t)))


Now if we copy a link into the clipboard and then call org-capture with the option b, we get prompted to edit the following before adding it to our bookmarks.

** [[https://cestlaz.github.io/stories/emacs/][Using Emacs Series - C'est la Z]]
:PROPERTIES:
:TIMESTAMP: <2020-09-17 do>
:END:


Works like a charm.

Custom URL

What if we need to modify the url in some way before providing it? I have that use case. All I needed to do was create a function that takes input from the user and provides it to org-web-tools--org-link-for-url. How hard can that be?! Uh-oh! I said the cursed phrase, didn't I?

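(defun org-web-tools-insert-link-for-given-url ()
  "Extend =org-web-tools-insert-link-for-url= to take a user given URL"
  (interactive)
  ;; Body missing from this copy; reconstructed from the text above (assumption).
  ;; The prompt string is a placeholder.
  (org-web-tools--org-link-for-url (read-string "URL: ")))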


We can, then, hook the whole thing up to our org-capture-templates and we get.

(setq org-capture-templates
      '(("b" "Bookmark (Clipboard)" entry (file+headline "~/path/to/bookmarks.org" "Bookmarks")
         "** %(org-web-tools-insert-link-for-clipboard-url)\n:PROPERTIES:\n:TIMESTAMP: %t\n:END:%?\n" :empty-lines 1 :prepend t)
        ("B" "Bookmark (Paste)" entry (file+headline "~/path/to/bookmarks.org" "Bookmarks")
         "** %(org-web-tools-insert-link-for-given-url)\n:PROPERTIES:\n:TIMESTAMP: %t\n:END:%?\n" :empty-lines 1 :prepend t)))


If we use B this time, it will prompt us for input.

Conclusion

I thought this was going to be harder to pull off but, as it turns out, it was simple to figure out, even for someone who doesn't know emacs-lisp. I hope to get more familiar with emacs-lisp with time and be able to do more. Until next time, I recommend you hook org-capture into your workflow. Make sure it fits your work style, otherwise you will not use it, and make your path a more productive one.

My journey in the Kubernetes Release Team

My learnings from working on the Kubernetes Release Team and leading the enhancements vertical

Concurrency bugs in Go

I recently read this paper titled Understanding Real-World Concurrency Bugs in Go (PDF), which studies concurrency bugs in Golang and comments on the new primitives for message passing that the language is often known for.

I am not a very good Go programmer, so this was an informative lesson in various ways to achieve concurrency and synchronization between different threads of execution. It is also a good read for experienced Go developers as it points out some important gotchas to look out for when writing Go code. The fact that it uses real world examples from well known projects like Docker, Kubernetes, gRPC-Go, CockroachDB, BoltDB etc. makes it even more fun to read!

The authors analyzed a total of 171 concurrency bugs from several prominent Go open source projects and categorized them in two orthogonal dimensions, one each for the cause of the bug and the behavior. The cause is split between two major schools of concurrency

Along the cause dimension, we categorize bugs into those that are caused by misuse of shared memory and those caused by misuse of message passing

and the behavior dimension is similarly split into

we separate bugs into those that involve (any number of) goroutines that cannot proceed (we call them blocking bugs) and those that do not involve any blocking (non-blocking bugs)

Interestingly, they chose the behavior to be blocking instead of deadlock, since the former implies that at least one thread of execution is blocked due to some concurrency bug while the rest might continue execution, so it is not necessarily a deadlock situation.

Go has primitive shared memory protection mechanisms like Mutex, RWMutex etc. with a caveat

Write lock requests in Go have a higher privilege than read lock requests.

as compared to pthread in C. Go also has a new primitive called sync.Once that can be used to guarantee that a function is executed only once. This can be useful in situations where some callable is shared across multiple threads of execution but shouldn't be called more than once. Go also has sync.WaitGroup, which is similar to pthread_join and is used to wait for various threads of execution to finish executing.
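To make the "executed only once" guarantee concrete, here is a rough Python analogy rather than Go code. The `Once` class is a hypothetical helper written for this note, not part of any library; joining the threads at the end plays the role that a WaitGroup (or pthread_join) would play.

```python
import threading


class Once:
    """Run a callable at most once, even when called from many threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False

    def do(self, fn):
        # Hold the lock for the whole call so late arrivals wait until
        # the first call has finished before returning.
        with self._lock:
            if not self._done:
                try:
                    fn()
                finally:
                    self._done = True


calls = []
once = Once()

# Eight threads race to run the initialiser; only one append happens.
workers = [
    threading.Thread(target=once.do, args=(lambda: calls.append("init"),))
    for _ in range(8)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()  # wait for every thread, like WaitGroup.Wait()
```

After all threads have been joined, `calls` holds a single entry no matter which thread got there first.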

Go also uses channels for message passing between different threads of execution, called goroutines. Channels can be buffered or un-buffered (default), the difference between them being that in a buffered channel the sender and receiver don't block on each other (until the buffered channel is full).
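The buffered case can be illustrated with a bounded queue. This is only an analogy in Python: `queue.Queue` is not a Go channel, and the standard library has no direct equivalent of an un-buffered (rendezvous) channel, so the sketch covers just the "sender does not block until the buffer is full" rule.

```python
import queue

# Roughly comparable to a channel with room for two messages.
buffered = queue.Queue(maxsize=2)

buffered.put_nowait("a")  # the sender continues without a receiver
buffered.put_nowait("b")  # the buffer is now full

try:
    buffered.put_nowait("c")  # a blocking put() would wait here
    third_send_blocked = False
except queue.Full:
    third_send_blocked = True

first = buffered.get_nowait()  # receiving frees a slot (FIFO order)
buffered.put_nowait("c")       # so the send now succeeds
```

A sender that used the blocking `put()` at the full point with no receiver around would wait forever, which is the shape of the blocking bugs the paper describes.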

The study of the usage patterns of these concurrency primitives in various code bases, along with the occurrence of bugs in those code bases, concluded that even though message passing was used in fewer places, it accounted for a larger number of bugs (58%).

Implication 1: With heavier usages of goroutines and new types of concurrency primitives, Go programs may potentially introduce more concurrency bugs

Also interesting to note is this observation in the paper

Observation 5: All blocking bugs caused by message passing are related to Go’s new message passing semantics like channel. They can be difficult to detect especially when message passing operations are used together with other synchronization mechanisms

The authors also talk about various ways in which the Go runtime can detect some of these concurrency bugs. The Go runtime includes a deadlock detector which can detect when there are no goroutines running in a thread, although it cannot detect all the blocking bugs that the authors found by manual inspection.

For shared memory bugs, Go also includes a data race detector which can be enabled by adding the -race option when building the program. It can find races in memory/data shared between multiple threads of execution and uses a happened-before algorithm underneath to track objects and their lifecycle. Although it can only detect a part of the bugs discovered by the authors, the patterns and classification in the paper can be leveraged to improve the detection and build more sophisticated checkers.

My Rubber Ducks

There are times when I find myself stuck when solving any problem. This deadlock can arise due to several factors. Somet...

URL Shortener in Golang

TLDR; Trying to learn new things, I tried writing a URL shortener called shorty. This is a first draft and I am trying to approach it from first principles, breaking everything down to the simplest components.

I decided to write my own URL shortener to dive a little more into Golang and to learn more about systems. I plan not only to document my learning but also to find and point out different ways in which this application can be made scalable, resilient and robust.

The high-level idea is to write a server which takes a long URL and returns a short URL for it. I have one more requirement: I want to be able to provide a slug, i.e. a custom short URL path. So for a link like https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed, I want to have a URL like url.farhaan.me/linktray, which is easy to remember and distribute.

The way I am thinking of implementing this is with two components: a CLI which talks to my server. I don’t want a fancy UI for now because I want it to be used exclusively through the terminal. It is a client-server architecture, where my CLI client sends a request to the server with a URL and an optional slug. If a slug is present, the short URL will have that slug in it; if not, the server generates a random string and uses that to make the URL short. Seen from a higher level, it’s not just a URL shortener but also a URL tagger.

The way a simple URL shortener works:

A client makes a request to shorten a given URL; the server takes the URL and stores it in the database, then generates a random string, maps the URL to that string and returns a URL like url.farhaan.me/<randomstring>.

Now when a client requests url.farhaan.me/<randomstring>, the request goes to the same server, which looks up the original URL and redirects the request to that website.

The slug implementation is very straightforward: given a word, I search the database; if the word is already present we raise an error, and if it isn’t we add it to the database and return the URL.

One optimization: since it’s just me who is going to use this, I can check whether the long URL already exists in the database and, if it does, skip creating a new entry. But this should only happen for random strings and not for slugs. This is also a trade-off between reducing redundancy and the latency of a request.

But when it comes to generating a random string, things get a tiny bit complicated. How the random strings are generated decides how many URLs you can store. There are various schemes I can use to generate the string, such as md5, base10 or base64. I also need to make sure that it gives a unique string and not repeated ones.

Uniqueness can be maintained using a counter. The count can either be supplied by a different service, which can help us scale the system better, or be generated internally; I have used the database record number for this.
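The flow described so far can be sketched in a few lines. This is a minimal in-memory illustration in Python, not the Go code of shorty: the `Shortener` class, its dictionaries and the base62 alphabet are all assumptions made for the example, and a plain counter stands in for the database record number.

```python
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols (assumed choice)


def encode(number):
    """Encode a non-negative integer as a base62 string."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, rem = divmod(number, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


class Shortener:
    """In-memory stand-in for the server and its database."""

    def __init__(self):
        self.by_key = {}   # short key or slug -> long URL
        self.by_url = {}   # long URL -> generated key (slugs excluded)
        self.counter = 0   # plays the role of the record number

    def shorten(self, url, slug=None):
        if slug is not None:
            if slug in self.by_key:
                raise ValueError("slug already taken: " + slug)
            self.by_key[slug] = url
            return slug
        if url in self.by_url:  # reuse the entry instead of adding one
            return self.by_url[url]
        # Skip counter values whose encoding collides with a slug.
        while True:
            self.counter += 1
            key = encode(self.counter)
            if key not in self.by_key:
                break
        self.by_key[key] = url
        self.by_url[url] = key
        return key

    def resolve(self, key):
        """Return the long URL to redirect to, or None if unknown."""
        return self.by_key.get(key)
```

Because every generated key comes from a distinct counter value, two different URLs can never receive the same key, and the key length only grows as the counter does.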

If you look at this from a system design point of view, we are using the same server to take the request, generate the URL and redirect the request. This can be separated into two services, where one service generates the URL and the other just redirects. This way we increase the availability of the system: if one of the services goes down, the other will still function.

The next step is to write and integrate a CLI to talk to the server and fetch the URL, a client that an end user can use. I am also planning to integrate a caching mechanism, not something off the shelf, but rather a simple caching system with some cache eviction policy that I write myself.
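As one possible shape for such a cache, here is a small least-recently-used (LRU) sketch in Python. LRU is only one eviction policy among several and is an assumption here; the post does not say which policy shorty will use, and the `LRUCache` name is made up for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry
```

In a shortener, the key would be the short string and the value the long URL, so popular redirects skip the database lookup.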

Till then I will be waiting for the feedback. Happy Hacking.

I now have a Patreon open so that you folks can support me in doing this stuff for a longer time and sustaining myself too. So feel free to subscribe and help me keep doing this, with added benefits.

Farhaan Bukhsh

TLDR; Link Tray is a utility we recently wrote to curate links from different places and share them with your friends. The blogpost has technical details and probably some productivity tips.

Link Bubble got my total attention when I got to know about it. I felt it was a very novel idea: it saves time and helps you curate the websites you visit. On the whole, and believe me I am downplaying it when I say this, Link Bubble does two things:

1. Saves time by pre-opening the pages
2. Helps you to keep a track of pages you want to visit

It’s a better tab management system; what I found weird was building a whole browser to do that. Obviously, I am being extremely naive when I say that, because I don’t know what it takes to build a utility like that.

Since it was discontinued for a while and I never got a chance to use it, I thought I would try building something very similar, though my use case was totally different. Generally, when I go through blogs or articles, I open the links mentioned in different tabs to come back to them later. This has bitten me many times because I just get lost in so many links.

I thought that if there were a utility which could capture the links on the fly, so that I could quickly go through them by title, it might ease my job. I bounced the idea off Abhishek and we ended up prototyping LinkTray.

Our first design was highly inspired by Facebook Messenger, but instead of chat heads we had open links. If you think about it, the idea feels very beautiful, but the design is “highly” not scalable. For example, with as many as 10 links open we had trouble finding the links we were interested in, which was a beautiful design problem to face.

We quickly went to the whiteboard and put up a list of requirements from first principles. The ask was simple:

1. To share multiple links with multiple people with least transitions
2. To be able to see what you are sharing

We took inspiration from an actual drawer, where we flick out a bunch of links and go through them. In a serendipitous moment the design came to us, and that’s how Link Tray came to look the way it does now.

Link Tray was a technical challenge as well. I learnt a plethora of things about the Android ecosystem and application development that I knew existed but had never ventured into exploring.

Link Tray is written in Java, and I was using a very loosely maintained library to get the overlay activity to work. Yes, the floating activity or application that we see is called an overlay activity; it allows the application to be opened over an already running application.

The library I was using doesn’t support Android O and above. Figuring that out took me a few nights, partly because I was hacking on the project at night. After reading a lot of GitHub issues I figured out the problem and added support for the required operating system versions.

One of the really exciting features of Android that I explored is Services. I think I might have read most of the blogs out there and all the documentation available, and I know that I still don't know enough. I was able to pick up enough pointers to make my utility work.

Just like Uncle Bob says, make it work and then make it better. There was a persistent problem: the service needed to keep running in the background for the utility to work. This was not a functional issue but it was certainly a performance issue, and our users of version 1.0 did have a problem with it. People were misled because there was a constant notification that LinkTray was running, and it was annoying. This looked like a simple problem on the surface but was a monster underneath.

The solution was simple: stop the service when the tray is closed, and start it again when a link is shared to Link Tray. I tried that; the service did stop, but when a new link was shared the application kept crashing. Later I figured out that the bound service started by the library I am using sets a bound flag to True, but resets the flag in the wrong place. This prompted me to write this StackOverflow answer to help people understand the lifecycle of a service. Finally, after a lot of logs and debugging sessions, I found the issue and fixed it. It was one of the most exciting moments, and it helped me learn a lot of key concepts.

The other key learning I got while developing Link Tray was about multithreading. When a link is shared to Link Tray, we need the title of the page, if it has one, and the favicon of the website. Initially I was doing this on the main UI thread, which is not only an anti-pattern but also a usability hazard: the network call blocked the application until it completed. I learnt how to make a network call on a different thread and keep the application smooth.

The initial approach was to get a WebView to work: we were literally opening the links in a browser and getting the title and favicon out, which was a very heavy process because we were spawning a browser just to get information about links. In the initial design it made sense, because we were giving an option to consume the links. Over time our design improved and we came to a point where we don’t give the option to consume but to curate. Hence we opted for web scraping; I used custom headers so that we don’t get caught by robots.txt. After so much effort, it got to a place where it is stable and performing great.
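To give a feel for how little is needed once the page source is in hand, here is a hedged Python sketch of pulling a title and favicon out of HTML. Link Tray itself is written in Java, so this is not its code; the `PageInfo` and `page_info` names are invented for the example, and the network fetch is left out so the parsing step can be shown on its own.

```python
from html.parser import HTMLParser


class PageInfo(HTMLParser):
    """Collect the <title> text and the first favicon link of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.favicon = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "link" and self.favicon is None:
            # rel can be "icon" or "shortcut icon"
            rel = (attrs.get("rel") or "").lower().split()
            if "icon" in rel:
                self.favicon = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_info(html):
    """Return (title, favicon_href) for an HTML document string."""
    parser = PageInfo()
    parser.feed(html)
    return parser.title.strip(), parser.favicon
```

Run on a worker thread, a function like this keeps the UI responsive while the title and icon are being resolved.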

It did take quite some time to reach the point where it is right now: it is fully functional and stable. Do give it a go if you haven’t; you can shoot any queries to me.

Happy Hacking!

GNU Emacs pretest builds for Fedora

I have been following GNU Emacs development through Debbugs and Sacha Chua’s newsletter. I always felt that I should use the latest development version of Emacs instead of sticking to the stable release. That way I get to use the latest improvements and also help in testing the changes. If I find any bugs, I can report those. The motivation for building pretests: I was planning to build RPM packages for Fedora from the master branch.

Holiday Greetings

I'm on vacation at the North Sea with my family, and like exactly one year ago I was facing the problem of having too many postcards to write. Last year, I had written a small Python script that would take a yaml file and compile it to an HTML postcard.

The yaml describes all adjustable parts of the postcard, like the content and address, but also a title, stamp and front image. A jinja2 template, a bit of CSS and javascript create a flipable postcard that can be sent via email - which is very convenient if you, like me, are too lazy to buy postcards and stamps, and have more email addresses in your address book than physical addresses.

A postcard yaml could look like this (click the card to flip it around):

---

- name: Holiday Status 2020
  front_image: 'private_images/ninja.jpg'
  address: |
    Schubisu's Blog
    World Wide Web
  title: I'm fine, thanks :)
  content: |
    Hey there!
    I'm currently on vacation and was stumbling over the same problem I had last year; writing greeting cards for friends and family. Luckily I've solved that issue last year, I simply had totally forgotten about it.
    This is an electronic postcard, made of HTML, CSS and a tiny bit of javascript, compiling my private photos and messages to a nice looking card.
    Feel free to fork, use and add whatever you like!
    Greets,
    Schubisu
  stamp: 'private_images/leuchtturm_2020.jpg'


and will be rendered by the script to this:

I was curious anyway, how this would be rendered on my blog. I've added a small adjustment to my CSS to scale the iframe tag by 0.75% and I'm okay with the result ;)

Write your own postcard or add some features! You can find the repository here: https://gitlab.com/schubisu/postcard.
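For readers who want to see the compile step in miniature, here is a standard-library sketch in Python. It is not the code from the repository: the real script uses a yaml file and a jinja2 template, while this stand-in takes an already loaded dictionary and uses `string.Template`; the markup and the `render_card` name are assumptions.

```python
from html import escape
from string import Template

# Hypothetical markup; the real template lives in the repository.
CARD_TEMPLATE = Template(
    '<div class="postcard">\n'
    '  <div class="front"><img src="$front_image" alt="$name"></div>\n'
    '  <div class="back">\n'
    '    <h1>$title</h1>\n'
    '    <p class="content">$content</p>\n'
    '    <img class="stamp" src="$stamp" alt="stamp">\n'
    '  </div>\n'
    '</div>\n'
)


def render_card(card):
    """Render one postcard dict (as loaded from the YAML) to HTML."""
    safe = {key: escape(str(value)) for key, value in card.items()}
    # Keep the line breaks of the multi-line message.
    safe["content"] = safe["content"].replace("\n", "<br>\n")
    return CARD_TEMPLATE.substitute(safe)


card = {
    "name": "Holiday Status 2020",
    "front_image": "private_images/ninja.jpg",
    "title": "I'm fine, thanks :)",
    "content": "Hey there!\nGreets,\nSchubisu",
    "stamp": "private_images/leuchtturm_2020.jpg",
}
html_card = render_card(card)
```

Escaping every value before substitution keeps a stray `<` or quote in a message from breaking the card's markup.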

Transitioning to Windows

So, recently I started using Windows for work. Why? There are a couple of reasons: one is that I needed to use MSVC, that is, the Microsoft Visual C++ toolchain, and the other is that I wasn’t quite comfortable with ifdef-ing stuff to make it work on GCC, aka the GNU counterpart of MSVC.

Krita Weekly #14

After an anxious month, I am writing a Krita Weekly again and probably this would be my last one too, though I hope not. Let’s start by talking about bugs. Unlike the trend going about the last couple of months, the numbers have taken a serious dip.

Bikeshedding

http://bikeshed.org/

What color should I paint the bike-shed?

Handling nested serializer validations in Django Rest Framework

I understand that the title of this post is a little confusing. Recently, while working on the Projects API in Weblate, I came across an interesting issue. The Projects API in Weblate allowed you to get an attribute called source_language. Every project has only one source_language and in the API, it was a read-only property.

{
    "name": "master_locales",
    "slug": "master_locales",
    "web": "https://example.site",
    "source_language": {
        "code": "en",
        "name": "English",
        "direction": "ltr",
        "web_url": "http://example.site/languages/en/",
        "url": "http://example.site/api/languages/en/"
    },
    "web_url": "http://example.site/projects/master_locales/",
    "url": "http://example.site/api/projects/master_locales/",
    "components_list_url": "http://example.site/api/projects/master_locales/components/",
    "repository_url": "http://example.site/api/projects/master_locales/repository/",
    "statistics_url": "http://example.site/api/projects/master_locales/statistics/",
    "changes_list_url": "http://example.site/api/projects/master_locales/changes/",
    "languages_url": "http://example.site/api/projects/master_locales/languages/"
}


As you can see, unlike the other relational fields, it's not a HyperlinkedIdentityField. It uses the nested language serializer to show all the attributes of the source_language.

Now, previously, when a project was created via API, a default language was always assigned to the project and there was no way to define the source_language while creating the project via API.

Problem?

Doing GET on Language Serializer when sending POST on Project Serializer

So we needed to add the feature to define the source_language of the project when we send a POST request to the Project API. And also edit the project via API to update the source_language. So, to use the same serializer, the request body for the POST request would look something like this:

{
    "name": "master_locales",
    "slug": "master_locales",
    "web": "https://example.site",
    "source_language": {
        "code": "ru",
        "name": "Russian",
        "direction": "ltr"
    }
}


Now, in general, we would have a python serializer like this:

class LanguageSerializer(serializers.ModelSerializer):
    class Meta:
        model = Language
        fields = ("code", "name", "direction", "web_url", "url")
        extra_kwargs = {
            "url": {"view_name": "api:language-detail", "lookup_field": "code"}
        }

class ProjectSerializer(serializers.ModelSerializer):
    source_language = LanguageSerializer(required=False)
    # ...
    # Other parts of the serializer


The problem with code like this is that when the ProjectSerializer gets a request like the one shown above and validates its data, it also validates the LanguageSerializer part, because a nested serializer automatically validates whatever data it receives. The code property of the Language model has a unique constraint. So, when LanguageSerializer tries to validate

{
    "code": "ru",
    "name": "Russian",
    "direction": "ltr"
}


it will throw an error "This field must be unique" for code property in case a language with codename ru already exists in the database.

Solution

So there are a few steps to get this done.

Remove validators from code field

extra_kwargs = {
    "url": {"view_name": "api:language-detail", "lookup_field": "code"},
    "code": {"validators": []},
}


Add "code": {"validators": []} to the extra_kwargs to remove the validator from the LanguageSerializer on every data request it receives.

Add manual validation for code field

Removing the validator also removes the validation when doing a POST request on the LanguageSerializer itself. Now, the LanguageSerializer in Weblate specifically doesn't support POST, but in any case, you would need to manually add a validation function to the LanguageSerializer so that if someone checks for validity before adding a language, it throws an error. To do that, add a function validate_code like this:

def validate_code(self, value):
    check_query = Language.objects.filter(code=value)
    # An existing code is only an error outside a nested source_language.
    if check_query.exists() and not (
        isinstance(self.parent, ProjectSerializer)
        and self.field_name == "source_language"
    ):
        raise serializers.ValidationError(
            "Language with this Language code already exists."
        )
    if not check_query.exists():
        # The message is missing in this copy of the post; with no
        # argument DRF falls back to its default error detail.
        raise serializers.ValidationError(
        )
    return value


Note: The name of the function must be validate_{field_name} when you are trying to validate a field based on how DRF handles validation.

Overwrite create() in ProjectSerializer

Finally, we would want to overwrite the create() function of ProjectSerializer to:

• Validate source_language data using the above validation to check if the language with that code exists
• Modify source_language key of the validated_data to have the Language model object rather than the dictionary, so it can be used to create a project with the foreign key.
• Lastly, create a project with the new validated_data

The code would look something like this:

def create(self, validated_data):
    source_language_validated = validated_data.get("source_language")
    if source_language_validated:
        validated_data["source_language"] = Language.objects.get(
            code=source_language_validated.get("code")
        )
    project = Project.objects.create(**validated_data)
    return project


And now, if you create a project, using the source_language key, you can define the source language for the project while using the Project API. There might be several other ways to go about it. But this is one of the ways I found works.
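The decision rule is easier to check when it is pulled out of Django. The following plain-Python sketch mirrors the logic only: `LANGUAGES`, the `nested_in_project` flag, the local `ValidationError` class and the error text for an unknown code are stand-ins invented for the example, and validation is called from inside `create_project` here, whereas DRF runs it before `create()`.

```python
class ValidationError(Exception):
    """Stand-in for rest_framework.serializers.ValidationError."""


# Stand-in for the Language table: code -> stored record.
LANGUAGES = {
    "en": {"code": "en", "name": "English"},
    "ru": {"code": "ru", "name": "Russian"},
}


def validate_code(code, nested_in_project):
    """An existing code is an error only outside a project's source_language."""
    exists = code in LANGUAGES
    if exists and not nested_in_project:
        raise ValidationError("Language with this Language code already exists.")
    if not exists:
        raise ValidationError("unknown language code: " + code)
    return code


def create_project(validated_data):
    """Swap the nested language dict for the stored record, then 'create'."""
    data = dict(validated_data)
    source_language = data.get("source_language")
    if source_language:
        code = validate_code(source_language["code"], nested_in_project=True)
        data["source_language"] = LANGUAGES[code]
    return data


project = create_project(
    {"name": "master_locales", "source_language": {"code": "ru", "name": "Russian"}}
)
```

The same code `ru` passes when it arrives nested in a project and is rejected when it arrives on its own, which is exactly the asymmetry the custom validator introduces.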

Also, this feature is now live in Weblate 4.* versions which allows you to define the source_language via the API.

git config

Git comes with this handy tool that lets you manage your Git configuration with ease.

Configuration Levels

--local (Default)

Local configuration values are stored and managed at the repository level. The values are stored in a file found in the repo’s .git directory: .git/config

If you are planning to set some configuration only to be used by the specific repo you are in, then go ahead with using --local.

Creating Custom Whoosh Plugin

Recently, while trying to work on a query parser feature in Weblate, I came across this search engine library called Whoosh. It provides certain nice features like indexing of text, parsing of search queries, scoring algorithms, etc. One good thing about this library is most of these features are customizable and extensible.

Now, the feature I was trying to implement is an exact search query. An exact search query would behave in a way such that the backend would search for an exact match of any query text provided to it instead of the normal substring search. Whoosh provides a plugin for regex, which can be accessed via whoosh.qparser.RegexPlugin(). So we can technically go about writing a regex to do the exact match. But a regex search will have worse performance than a simple string comparison.

So, one of the ways of doing a new kind of query parsing is creating a custom whoosh plugin. And that's what this blog is going to be about.

Simple Whoosh Plugin

In some cases, you will probably not need a complicated plugin, but just want to extend the feature of an existing plugin to match a different kind of query. For example, let's say you want to extend the ability of SingleQuotePlugin to parse queries wrapped in either single-quotes or double-quotes.

class QuotePlugin(whoosh.qparser.SingleQuotePlugin):
    """Single and double quotes to specify a term."""

    expr = r"(^|(?<=\W))['\"](?P<text>.*?)['\"](?=\s|\]|[)}]|$)"


In the above example, QuotePlugin extends the already existing SingleQuotePlugin class. It just overrides the expression to parse the query. The expression, mentioned in the variable expr, is usually a regular expression with the ?P<text> part denoting the TermQuery. A TermQuery is the final term/terms searched for in the database. So in the above regex, we say to parse any query such that the TermQuery is wrapped in between single-quotes or double-quotes.

Query Class

A query class is the class whose instance the final parsed term will be. Unless otherwise mentioned, it's usually <Term>. So if we want our plugin to parse the query and show it as an instance of a custom class, we need to define a custom query class.

class Exact(whoosh.query.Term):
    """Class for queries with exact operator."""

    pass


So, as you can see, we can just have a simple class extending whoosh.query.Term so that while checking the parsed terms, we can get it as an instance of Exact. That will help us differentiate the query from a normal Term instance.

Custom Whoosh Plugin

After writing the query class, we will need to write the custom plugin class.

class ExactPlugin(whoosh.qparser.TaggingPlugin):
    """Exact match plugin with quotes to specify an exact term."""

    class ExactNode(whoosh.qparser.syntax.TextNode):
        qclass = Exact

        def r(self):
            return "Exact %r" % self.text

    expr = r"\=(^|(?<=\W))(['\"]?)(?P<text>.*?)\2(?=\s|\]|[)}]|$)"
    nodetype = ExactNode


In the above example, unlike the simple case, we extend TaggingPlugin instead of any other pre-defined plugin. Most of the pre-defined plugins in whoosh also extend TaggingPlugin. So it is a good fit as a parent class.

Then, we create an ExactNode class, which we assign as the node type for the custom plugin. A node type class basically defines the query class to be used in this custom plugin, along with various representations and properties of the parsed node. qclass holds the query class created before, so that the final parsed term becomes an Exact instance.

Apart from that, we have the expr which contains the regex just like in the simple example to parse the query term.
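The two expressions can be tried out on their own with Python's `re` module, which is a quick way to see what ends up in the `text` group. This sketch does not involve Whoosh at all, so it says nothing about how the parser tags or orders plugins; the `term` helper is a made-up name for the example.

```python
import re

# The two expressions from the plugins above, copied verbatim.
QUOTE_EXPR = r"(^|(?<=\W))['\"](?P<text>.*?)['\"](?=\s|\]|[)}]|$)"
EXACT_EXPR = r"\=(^|(?<=\W))(['\"]?)(?P<text>.*?)\2(?=\s|\]|[)}]|$)"


def term(expr, query):
    """Return the captured term, or None when the pattern does not match."""
    match = re.search(expr, query)
    return match.group("text") if match else None


single = term(QUOTE_EXPR, "'hello world'")
double = term(QUOTE_EXPR, 'say "exact phrase" now')
exact_quoted = term(EXACT_EXPR, '="foo bar"')
exact_bare = term(EXACT_EXPR, "=foo bar")
not_exact = term(EXACT_EXPR, "foo bar")
```

In the exact expression the optional quote is captured as group 2 and reused through `\2`, so a quoted term must close with the same quote it opened with, while an unquoted term stops at the first whitespace.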

Finally...

After creating the custom plugin, you can:

• add this plugin to the list of plugins defined in the whoosh query parser class
• use the query class to make an isinstance() check when making database queries
• check for the node type in the different nodes used by the parser

Using Docker with Ansible

[Published in Open Source For You (OSFY) magazine, October 2017 edition.]

This article is the eighth in the DevOps series. In this issue, we shall learn to set up Docker in the host system and use it with Ansible.

Introduction

Docker provides operating system level virtualisation in the form of containers. These containers allow you to run standalone applications in an isolated environment. The three important features of Docker containers are isolation, portability and repeatability. All along we have used Parabola GNU/Linux-libre as the host system, and executed Ansible scripts on target Virtual Machines (VM) such as CentOS and Ubuntu.

Docker containers are extremely lightweight and fast to launch. You can also specify the amount of resources that you need such as CPU, memory and network. The Docker technology was launched in 2013, and released under the Apache 2.0 license. It is implemented using the Go programming language. A number of frameworks have been built on top of Docker for managing clusters of servers. The Apache Mesos project, Google’s Kubernetes, and the Docker Swarm project are popular examples. These are ideal for running stateless applications and help you to easily scale them horizontally.

Setup

The Ansible version used on the host system (Parabola GNU/Linux-libre x86_64) is 2.3.0.0. Internet access should be available on the host system. The ansible/ folder contains the following file:

ansible/playbooks/configuration/docker.yml

Installation

The following playbook is used to install Docker on the host system:

---
- name: Setup Docker
  hosts: localhost
  gather_facts: true
  become: true
  tags: [setup]

  tasks:
    - name: Update the software package repository
      pacman:
        update_cache: yes

    - name: Install dependencies
      package:
        name: "{{ item }}"
        state: latest
      with_items:
        - python2-docker
        - docker

    - service:
        name: docker
        state: started

    - name: Run the hello-world container
      docker_container:
        name: hello-world
        image: library/hello-world

The Parabola package repository is updated before proceeding to install the dependencies. The python2-docker package is required for use with Ansible. Hence, it is installed along with the docker package. The Docker daemon service is then started and the library/hello-world container is fetched and executed. A sample invocation and execution of the above playbook is shown below:

$ ansible-playbook playbooks/configuration/docker.yml -K --tags=setup
SUDO password:

PLAY [Setup Docker] *************************************************************

TASK [Gathering Facts] **********************************************************
ok: [localhost]

TASK [Update the software package repository] ***********************************
changed: [localhost]

TASK [Install dependencies] *****************************************************
ok: [localhost] => (item=python2-docker)
ok: [localhost] => (item=docker)

TASK [service] ******************************************************************
ok: [localhost]

TASK [Run the hello-world container] ********************************************
changed: [localhost]

PLAY RECAP **********************************************************************
localhost : ok=5 changed=2 unreachable=0 failed=0

With verbose ’-v’ option to ansible-playbook, you will see an entry for LogPath, such as /var/lib/docker/containers//-json.log. In this log file you will see the output of the execution of the hello-world container. This output is the same when you run the container manually as shown below:

$ sudo docker run hello-world

Hello from Docker!

This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
https://cloud.docker.com/

For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/

Example

A Deep Learning (DL) Docker project is available (https://github.com/floydhub/dl-docker) with support for frameworks, libraries and software tools. We can use Ansible to build the entire DL container from the source code of the tools. The base OS of the container is Ubuntu 14.04, and will include the following software packages:

• Tensorflow
• Caffe
• Theano
• Keras
• Lasagne
• Torch
• iPython/Jupyter Notebook
• Numpy
• SciPy
• Pandas
• Scikit Learn
• Matplotlib
• OpenCV

The playbook to build the DL Docker image is given below:

- name: Build the dl-docker image
  hosts: localhost
  gather_facts: true
  become: true
  tags: [deep-learning]

  vars:
    DL_BUILD_DIR: "/tmp/dl-docker"
    DL_DOCKER_NAME: "floydhub/dl-docker"

  tasks:
    - name: Download dl-docker
      git:
        repo: https://github.com/saiprashanths/dl-docker.git
        dest: "{{ DL_BUILD_DIR }}"

    - name: Build image and with buildargs
      docker_image:
        path: "{{ DL_BUILD_DIR }}"
        name: "{{ DL_DOCKER_NAME }}"
        dockerfile: Dockerfile.cpu
        buildargs:
          tag: "{{ DL_DOCKER_NAME }}:cpu"

We first clone the Deep Learning docker project sources. The docker_image module in Ansible helps us to build, load and pull images. We then use the Dockerfile.cpu file to build a Docker image targeting the CPU. If you have a GPU in your system, you can use the Dockerfile.gpu file. The above playbook can be invoked using the following command:

$ ansible-playbook playbooks/configuration/docker.yml -K --tags=deep-learning

Depending on the CPU and RAM you have, it will take a considerable amount of time to build the image with all the software. So be patient!

Jupyter Notebook

The built dl-docker image contains Jupyter notebook which can be launched when you start the container. An Ansible playbook for the same is provided below:

- name: Start Jupyter notebook
  hosts: localhost
  gather_facts: true
  become: true
  tags: [notebook]

  vars:
    DL_DOCKER_NAME: "floydhub/dl-docker"

  tasks:
    - name: Run container for Jupyter notebook
      docker_container:
        name: "dl-docker-notebook"
        image: "{{ DL_DOCKER_NAME }}:cpu"
        state: started
        command: sh run_jupyter.sh

You can invoke the playbook using the following command:

$ ansible-playbook playbooks/configuration/docker.yml -K --tags=notebook

The Dockerfile already exposes the port 8888, and hence you do not need to specify the same in the above docker_container configuration. After you run the playbook, using the ‘docker ps’ command on the host system, you can obtain the container ID as indicated below:

$ sudo docker ps
CONTAINER ID        IMAGE                    COMMAND               CREATED             STATUS              PORTS                NAMES
a876ad5af751        floydhub/dl-docker:cpu   "sh run_jupyter.sh"   11 minutes ago      Up 4 minutes        6006/tcp, 8888/tcp   dl-docker-notebook

You can now login to the running container using the following command:

$ sudo docker exec -it a876 /bin/bash

You can then run an ‘ifconfig’ command to find the local IP address (“172.17.0.2” in this case), and then open http://172.17.0.2:8888 in a browser on your host system to see the Jupyter Notebook. A screenshot is shown in Figure 1:

TensorBoard

TensorBoard consists of a suite of visualization tools to understand the TensorFlow programs. It is installed and available inside the Docker container. After you login to the Docker container, at the root prompt, you can start Tensorboard by passing it a log directory as shown below:

# tensorboard --logdir=./log

You can then open http://172.17.0.2:6006/ in a browser on your host system to see the Tensorboard dashboard as shown in Figure 2:

Docker Image Facts

The docker_image_facts Ansible module provides useful information about a Docker image. We can use it to obtain the image facts for our dl-docker container as shown below:

- name: Get Docker image facts
  hosts: localhost
  gather_facts: true
  become: true
  tags: [facts]

  vars:
    DL_DOCKER_NAME: "floydhub/dl-docker"

  tasks:
    - name: Get image facts
      docker_image_facts:
        name: "{{ DL_DOCKER_NAME }}:cpu"

The above playbook can be invoked as follows:

$ ANSIBLE_STDOUT_CALLBACK=json ansible-playbook playbooks/configuration/docker.yml -K --tags=facts

The ANSIBLE_STDOUT_CALLBACK environment variable is set to ‘json’ to produce a JSON output for readability. Some important image facts from the invocation of the above playbook are shown below:

"Architecture": "amd64",
"Author": "Sai Soundararaj <saip@outlook.com>",

"Config": {

"Cmd": [
"/bin/bash"
],

"Env": [
"PATH=/root/torch/install/bin:/root/caffe/build/tools:/root/caffe/python:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"CAFFE_ROOT=/root/caffe",
"PYCAFFE_ROOT=/root/caffe/python",
"PYTHONPATH=/root/caffe/python:",
"LUA_PATH=/root/.luarocks/share/lua/5.1/?.lua;/root/.luarocks/share/lua/5.1/?/init.lua;/root/torch/install/share/lua/5.1/?.lua;/root/torch/install/share/lua/5.1/?/init.lua;./?.lua;/root/torch/install/share/luajit-2.1.0-beta1/?.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua",
"LD_LIBRARY_PATH=/root/torch/install/lib:",
"DYLD_LIBRARY_PATH=/root/torch/install/lib:"
],

"ExposedPorts": {
"6006/tcp": {},
"8888/tcp": {}
},

"Created": "2016-06-13T18:13:17.247218209Z",
"DockerVersion": "1.11.1",

"Os": "linux",

"task": { "name": "Get image facts" }

You are encouraged to read the ‘Getting Started with Docker’ user guide available at http://docs.ansible.com/ansible/latest/guide_docker.html to know more about using Docker with Ansible.
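Since the facts playbook emits JSON, the output can be post-processed with a few lines of Python. The following is only a minimal sketch: the SAMPLE string is a hand-trimmed copy of the image facts shown above, and the exposed_ports helper is a hypothetical name, not part of Ansible or Docker. The exact nesting of the real callback output may differ, so you would need to locate the image entry first.

```python
import json

# A hand-trimmed copy of the image facts shown above (assumption: the real
# callback output wraps this in additional play/task structure).
SAMPLE = """
{
  "Architecture": "amd64",
  "Config": {
    "Cmd": ["/bin/bash"],
    "ExposedPorts": {"6006/tcp": {}, "8888/tcp": {}}
  },
  "Os": "linux"
}
"""


def exposed_ports(image_facts):
    """Return the exposed port numbers of an image as a sorted list of ints."""
    ports = image_facts.get("Config", {}).get("ExposedPorts") or {}
    # Keys look like "8888/tcp"; keep only the numeric part.
    return sorted(int(spec.split("/")[0]) for spec in ports)


facts = json.loads(SAMPLE)
print(exposed_ports(facts))
```

For the dl-docker image this lists the TensorBoard and Jupyter ports (6006 and 8888) that the earlier sections open in a browser.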

Testing the next-gen pip dependency resolver

This is an attempt to summarize the broader software architecture around dependency resolution in pip and how testing is being done around this area.

The motivation behind writing this, is to make sure all the developers working on this project are on the same page, and to have a written record about the state of affairs.

Architecture

The “legacy” resolver in pip is implemented as part of pip’s codebase and has been a part of it for many years. It’s very tightly coupled with the existing code, isn’t easy to work with, and modifying it directly raises severe backward compatibility concerns, which is why we’re implementing a separate “new” resolver in this project instead of trying to improve the existing one.

The “new” resolver that is under development is not implemented as part of pip’s codebase; not completely anyway. We’re using an abstraction that separates all the metadata-generation-and-handling stuff from the core algorithm. This allows us to work on the core algorithm logic (i.e. the NP-hard search problem) separately from pip-specific logic (e.g. downloading, building etc). The abstraction and core algorithm are written/maintained in https://github.com/sarugaku/resolvelib right now. The pip-specific logic for implementing the “other side” of the abstraction is in https://github.com/pypa/pip/tree/master/src/pip/_internal/resolution/resolvelib.
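To make the split a bit more concrete, here is a toy sketch of the idea. It is not resolvelib’s API and not how pip implements it: INDEX, ToyProvider and resolve are made-up names, versions are plain integers, and the “core algorithm” is a naive backtracking search standing in for the real one. The point is only that the search never touches package metadata directly; it asks the provider.

```python
# Hypothetical package index: name -> version -> list of (dependency, allowed versions).
INDEX = {
    "app": {1: [("lib", {1, 2}), ("util", {2})]},
    "lib": {1: [], 2: [("util", {1})]},
    "util": {1: [], 2: []},
}


class ToyProvider:
    """The 'pip side': finds candidates and reads their metadata."""

    def __init__(self, index):
        self.index = index

    def find_candidates(self, name, allowed):
        # Newest versions first, restricted to what the requirement allows.
        return sorted((v for v in self.index[name] if v in allowed), reverse=True)

    def get_dependencies(self, name, version):
        return self.index[name][version]


def resolve(provider, requirements, pinned=None):
    """The 'core algorithm' side: a plain backtracking search."""
    pinned = dict(pinned or {})
    if not requirements:
        return pinned
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in pinned:
        # Already chosen: either it fits this requirement or this branch is dead.
        return resolve(provider, rest, pinned) if pinned[name] in allowed else None
    for version in provider.find_candidates(name, allowed):
        trial = dict(pinned, **{name: version})
        deps = provider.get_dependencies(name, version)
        result = resolve(provider, list(deps) + list(rest), trial)
        if result is not None:
            return result
    return None  # No candidate worked; the caller backtracks.


print(resolve(ToyProvider(INDEX), [("app", {1})]))
```

In this toy index, lib 2 needs util 1 while app needs util 2, so the search has to back off to lib 1 before it finds a consistent set.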

Testing

In terms of testing, we have dependency-resolution-related tests in both resolvelib and pip.

resolvelib

The tests in resolvelib are intended more as “check if the algorithm does things correctly” and even include tests that are agnostic to the Python ecosystem (e.g. we’ve borrowed tests from Ruby, Swift etc). The goal here is to make sure that the core algorithm we implement is capable of generating correct answers (for example: not getting stuck looping on the same “requirement”, not revisiting rejected nodes etc).

pip

The tests in pip are where I’ll start needing more words to explain what’s happening. :)

YAML-based tests

We have “YAML” tests which I’d written back in 2017, as a format to easily write tests for pip’s new resolver when we implement it. However, since we didn’t have a need for it to be working completely back then (there wasn’t a new resolver to test with it!), the “harness” for running these tests isn’t complete and would likely need some work to be as feature complete as we’d want it to be, for writing good tests.

“new” resolver tests

unit tests

We have some unit tests for the new resolver implementation. These cover very basic “sanity checks” to ensure it follows the “contract” of the abstraction, like “do the candidates returned by a requirement actually satisfy that requirement?”. These likely don’t need to be touched, since they’re fairly well scoped and test fairly low-level details (i.e. ideal for unit tests).

New resolver unit tests: https://github.com/pypa/pip/tree/master/tests/unit/resolution_resolvelib
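As an illustration of what such a “contract” check looks like, here is a small self-contained sketch. ToyRequirement and candidates_satisfy_requirement are hypothetical names invented for this example; pip’s actual classes and tests are more involved and live at the link above.

```python
class ToyRequirement:
    """A hypothetical requirement: a package name plus the versions it accepts."""

    def __init__(self, name, allowed_versions):
        self.name = name
        self.allowed_versions = set(allowed_versions)

    def is_satisfied_by(self, candidate):
        name, version = candidate
        return name == self.name and version in self.allowed_versions

    def find_matches(self, index):
        # Candidates are (name, version) pairs, newest first.
        versions = sorted(index.get(self.name, []), reverse=True)
        return [(self.name, v) for v in versions if v in self.allowed_versions]


def candidates_satisfy_requirement(requirement, index):
    """The sanity check: every returned candidate must satisfy the requirement."""
    return all(
        requirement.is_satisfied_by(candidate)
        for candidate in requirement.find_matches(index)
    )


INDEX = {"lib": [1, 2, 3], "util": [1]}
requirement = ToyRequirement("lib", {2, 3})
print(requirement.find_matches(INDEX))
print(candidates_satisfy_requirement(requirement, INDEX))
```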

functional tests

We also have “new resolver functional tests”, which are written as part of the current work. These exist since how-to-work-with-YAML-tests was not an easy question to answer and there needs to be work done (both on the YAML format, as well as the YAML test harness) to flag which tests should run with which resolver (both, only legacy, only new) and make it possible to run these tests in CI easily.

New resolver functional tests: https://github.com/pypa/pip/blob/master/tests/functional/test_new_resolver.py

test_install*.py

These files test all the functionality of the install command (like: does it use the right build dependencies, does it download the correct files, does it write the correct metadata etc). There might be some dependency-resolution-related tests in test_install*.py files.

These files contain a lot of tests so, ideally, at some point, someone would go through and de-duplicate tests from this as well.

How can you help?

If you use pip, there are multiple ways that you can help us!

• First and most fundamentally, please help us understand how you use pip by talking with our user experience researchers. You can do this right now! You can take a survey, or have a researcher interview you over a video call. Please sign up and spread the word to anyone who uses pip (even a little bit).

• Right now, even before we release the new resolver as a beta, you can help by running pip check on your current environment. This will report if you have any inconsistencies in your set of installed packages. Having a clean installation will make it much less likely that you will hit issues when the new resolver is released (and may address hidden problems in your current environment!). If you run pip check and run into stuff you can’t figure out, please ask for help in our issue tracker or chat.

Thanks to Paul Moore and Tzu-Ping for help in reviewing and writing this post, as well as Sumana Harihareswara for suggesting to put this up on my blog!

OSS Work update #8

I’m trying to post these roughly once a month. Here’s the January post.

I am working on open source projects, as part of an internship at FOSSEE and as a part of grant-funded work on pip’s dependency resolver.

Work I did (Jan 6 - Feb 5)

Technical

• Co-worked with another developer, in person, for 1 week, on pip!
• Triaged pip’s issue tracker (a lot).
• Spent some time improving pip’s test suite infrastructure.
• Investigated Python 2 usage, to identify anomalies.
• Helped with virtualenv 20.0 release (kinda!).
• Invested effort to improve pip’s test suite
• Helped aggregate test cases for pip’s next generation resolver.

Communication

• Managed the pip 20.0 release fiasco.
• Helped the UX folks get started with working on pip.

January has been a very productive month.

Most of the challenges have been the logistics around work, not the work as such.

My health has been pretty good and there’s a certain flow to my work that I’m enjoying now. Turns out, if you like what you’re doing, you tend to be pretty productive! :)

As long as I remember to push my blog posts to the repository, they’ll actually go live on the day they’re supposed to.

Goals for February 2020

Technical

• Internal Cleansing: AKA Technical debt down payment.
• Issue triage: Triage a fair number of issues on pip’s issue tracker.
• Technical Documentation: improving pip’s technical documentation, for contributors and developers

Communication

• Help all the other contractors to get up to “full speed” for working on pip
• Get PyPA to participate in GSoC 2020
• Python Packaging Summit at PyCon US 2020: help organization.
• Move forward on Python Packaging Governance

None.

Help us

How can you help us?

• Provide test cases where the latest released version of pip (19.3.1, at the time of writing) fails to resolve dependencies properly (on zazo’s issue tracker). They will help us design and test the new resolver.
• Talk with your company about becoming a PSF sponsor. The Fundable Packaging Improvements page lists fairly well-scoped projects that would happen much faster if we get funding to achieve them.
• Have an interview with our UX expert, who is working to improve usability of Python Packaging tooling.

"isn't a title of this post" isn't a title of this post

[NOTE: This post originally appeared on deepsource.io, and has been posted here with due permission.]

In the early part of the last century, when David Hilbert was working on a stricter formalization of geometry than Euclid’s, Georg Cantor had worked out a theory of different types of infinities, the theory of sets. This theory would soon unveil a series of confusing paradoxes, leading to a crisis in the Mathematics community regarding the stability of the foundational principles of the math of that time.

Central to these paradoxes was Russell’s paradox (or more generally, as we’ll discuss later, the Epimenides Paradox). Let’s see what it is.

In those simpler times, you were allowed to define a set if you could describe it in English. And, owing to mathematicians’ predilection for self-reference, sets could contain other sets.

Russell then, came up with this:

$$R$$ is the set of all the sets which do not contain themselves.

The question was "Does $$R$$ contain itself?" If it doesn’t, then according to the second half of the definition it should. But if it does, then it no longer meets the definition.

The same can symbolically be represented as:

Let $$R = \{ x \mid x \not \in x \}$$, then $$R \in R \iff R \not \in R$$

Cue mind exploding.
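For the programmers in the audience, the same knot can be tied in a few lines of Python. This is only a loose analogy, under the assumption that we model a “set” by its membership test (a function that says whether something belongs to it); the names russell and outcome are made up for this sketch.

```python
def russell(predicate):
    """'Contains' exactly those predicates that do not contain themselves."""
    return not predicate(predicate)


# Asking "does R contain R?" means evaluating russell(russell), which needs
# the answer to russell(russell) first, and so on without end.
try:
    russell(russell)
    outcome = "settled on an answer"
except RecursionError:
    outcome = "never settles"

print(outcome)
```

Python gives up once its recursion limit is hit, which is about as close as an interpreter gets to throwing its hands in the air.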

“Grelling’s paradox” is a startling variant which uses adjectives instead of sets. If adjectives are divided into two classes, autological (self-descriptive) and heterological (non-self-descriptive), then, is ‘heterological’ heterological? Try it!

The so-called Liar Paradox was another such paradox, one which shredded whatever concept of ‘computability’ existed at that time - the notion that things could either be true or false.

Epimenides was a Cretan, who made one immortal statement:

“All Cretans are liars.”

If all Cretans are liars, and Epimenides was a Cretan, then he was lying when he said that “All Cretans are liars”. But wait, if he was lying then, how can we ‘prove’ that he wasn’t lying about lying? Ein?

This is what makes it a paradox: a statement so rudely violating the assumed dichotomy of statements into true and false, because if you tentatively think it’s true, it backfires on you and makes you think that it is false. And a similar backfire occurs if you assume that the statement is false. Go ahead, try it!

If you look closely, there is one common culprit in all of these paradoxes, namely ‘self-reference’. Let’s look at it more closely.

Strange Loopiness

If self-reference, or what Douglas Hofstadter - whose prolific work on the subject matter has inspired this blog post - calls ‘Strange Loopiness’, was the source of all these paradoxes, it made perfect sense to just banish self-reference, or anything which allowed it to occur. Russell and Whitehead, two rebel mathematicians of the time who subscribed to this point of view, undertook a mammoth exercise, namely “Principia Mathematica”, which, as we will see in a little while, was utterly demolished by Gödel’s findings.

The main thing which made it difficult to ban self-reference was that it was hard to pinpoint exactly where the self-reference occurred. It might well be spread out over several steps, as in this ‘expanded’ version of Epimenides:

The next statement is a lie.

The previous statement is true.
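If you’d rather let a machine do the trying, the two statements can be brute-forced. The sketch below (consistent_assignments is a made-up helper) simply checks all four ways of calling each statement true or false, and keeps the ones where both statements say what they claim to say.

```python
from itertools import product


def consistent_assignments():
    """Return every (first, second) truth assignment that both statements allow."""
    solutions = []
    for first, second in product([True, False], repeat=2):
        # "The next statement is a lie": first is true exactly when second is false.
        first_holds = first == (not second)
        # "The previous statement is true": second is true exactly when first is true.
        second_holds = second == first
        if first_holds and second_holds:
            solutions.append((first, second))
    return solutions


print(consistent_assignments())
```

The list comes back empty: no assignment survives, even though neither sentence mentions itself directly.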

Russell and Whitehead, in P.M., then came up with a multi-hierarchy set theory to deal with this. The basic idea was that a set of the lowest ‘type’ could only contain ‘objects’ as members (not sets). A set of the next type could then contain only objects or sets of lower types. This implicitly banished self-reference.

Since all sets must have a type, a set ‘which contains all sets which are not members of themselves’ is not a set at all, and thus you can say that Russell’s paradox was dealt with.

Similarly, if an attempt is made towards applying the expanded Epimenides to this theory, it must fail as well: for the first sentence to make a reference to the second one, it has to be hierarchically above it - in which case, the second one can’t loop back to the first one.

Thirty-one years after David Hilbert challenged academia to rigorously demonstrate that the system defined in Principia Mathematica was both consistent (contradiction-free) and complete (i.e. every true statement could be evaluated to true within the methods provided by P.M.), Gödel published his famous Incompleteness Theorem. By importing the Epimenides Paradox right into the heart of P.M., he proved that not just the axiomatic system developed by Russell and Whitehead, but no axiomatic system powerful enough to express arithmetic could be complete without being inconsistent.

Clearly enough, P.M. lost its charm in the realm of academics.

Even before Gödel’s work, P.M. wasn’t particularly loved.

Why?

It isn’t just limited to this blog post, but we humans, in general, have an appetite for self-reference - and this quirky theory severely limits our ability to abstract away details - something which we love, not only as programmers, but as linguists too. So much so, that the opening of this paragraph, “It isn’t … this blog … we humans …”, would be doubly forbidden. Firstly, the ‘right’ to mention ‘this blog post’ is limited only to something which is hierarchically above blog posts, ‘metablog-posts’. Secondly, I (presumably a human), belonging to the class ‘we’, can’t mention ‘we’ either.

Since we humans love self-reference so much, let’s discuss some ways in which it can be expressed in written form.

One way of making such a strange loop, and perhaps the ‘simplest’ is using the word ‘this’. Here:

• This sentence is made up of eight words.
• This sentence refers to itself, and is therefore useless.
• This blog post is so good.
• This sentence conveys to you the meaning of ‘this’.
• This sentence is a lie. (Epimenides Paradox)

Another amusing trick for creating a self-reference without using the phrase ‘this sentence’ is to quote the sentence inside itself.

Someone may come up with:

The sentence ‘The sentence contains five words’ contains five words.

But, such an attempt must fail, for to quote a finite sentence inside itself would mean that the sentence is smaller than itself. However, infinite sentences can be self-referenced this way.

The sentence
"The sentence
"The sentence
...etc
...etc
is infinitely long"
is infinitely long"
is infinitely long"


There’s a third method as well, which you already saw in the title - the Quine method. The term ‘Quine’ was coined by Douglas Hofstadter in his book “Gödel, Escher, Bach” (which heavily inspires this blog post). When using this, the self-reference is ‘generated’ by describing a typographical entity, isomorphic to the quine sentence itself. This description is carried out in two parts - one is a set of ‘instructions’ about how to ‘build’ the sentence, and the other, the ‘template’, contains information about the construction materials required.

The Quine version of Epimenides would be:

“yields falsehood when preceded by its quotation” yields falsehood when preceded by its quotation

Before going on with ‘quining’, let’s take a moment and realize how awfully powerful our cognitive capacities are, and what goes on in our heads when a cognitive payload full of self-references is delivered - in order to decipher it, we not only need to know the language, but also need to work out the referent of the phrase analogous to ‘this sentence’ in that language. This parsing depends on our complex, yet totally assimilated ability to handle the language.

The idea of referring to itself is quite mind-blowing, and yet we keep doing it all the time, which is perhaps why it feels so ‘easy’ for us to do so. But we aren’t born that way, we grow that way. This could better be realized by telling someone much younger “This sentence is wrong.”. They’d probably be confused - What sentence is wrong? The reason why it’s so simple for self-reference to occur, and hence allow paradoxes, in our language, is well, our language. It allows our brain to do the heavy lifting of working out what the author is trying to get across to us, without being verbose.

Back to Quines.

Reproducing itself

Now that we are aware of how ‘quines’ can manifest as self-reference, it would be interesting to see how the same technique can be used by a computer program to ‘reproduce’ itself.

To make it even more interesting, we shall choose the language most apt for the purpose - brainfuck:

>>>>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++[->++++++++<]>--....[-]<<[<]<<++++++++[->+++++>++++++++<<]>+++>-->[[-<<<+>.>>]<.[->+<]<[->+<]>>>]<<<<[<]>[.>]



Running the program above produces the program itself as the output. I agree, it isn’t the most descriptive program in the world, so written in Python below is the nearest we can go to describe what’s happening inside those horrible chains of +’s and >’s:

THREE_QUOTES = '"' * 3

def eniuq(template): print(
f'{template}({THREE_QUOTES}{template}{THREE_QUOTES})')

eniuq("""THREE_QUOTES = '"' * 3

def eniuq(template): print(
f'{template}({THREE_QUOTES}{template}{THREE_QUOTES})')

eniuq""")


The first line generates """ on the fly, which marks multiline strings in Python.

The next two lines define the eniuq function, which prints the argument template twice - once plain, and then surrounded with triple quotes.

The last 4 lines cleverly call this function so that the output of the program is the source code itself.

Since we are printing in an order opposite of quining, the name of the function is ‘quine’ reversed -> eniuq (name stolen from Hofstadter again).
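If you don’t feel like trusting me on this, the claim is easy to check mechanically. The sketch below rebuilds the quine’s source from its template (TEMPLATE, QUINE_SOURCE and run_and_capture are names I made up for the check), runs it, and compares what it printed against the source, character for character. It assumes the quine is written without leading indentation, as shown above.

```python
import contextlib
import io

# Everything in the quine up to (and including) the final 'eniuq'.
TEMPLATE = (
    "THREE_QUOTES = '\"' * 3\n"
    "\n"
    "def eniuq(template): print(\n"
    "f'{template}({THREE_QUOTES}{template}{THREE_QUOTES})')\n"
    "\n"
    "eniuq"
)

# The full program: the template, then the template again as a quoted argument.
QUINE_SOURCE = TEMPLATE + '("""' + TEMPLATE + '""")\n'


def run_and_capture(source):
    """Execute `source` in a fresh namespace and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


output = run_and_capture(QUINE_SOURCE)
print(output == QUINE_SOURCE)
```

Notice that building QUINE_SOURCE is itself the quining recipe: the ‘template’ once as instructions, and once more as quoted material.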

Remember the discussion about how self-reference capitalizes on the processor? What if ‘quining’ was a built-in feature of the language, providing what we in programmer lingo call ‘syntactic sugar’?

Let’s assume that an asterisk, *, in the brainfuck interpreter would copy the instructions before executing them. What would then be the output of the following program?

*


It’d be an asterisk again. You could make an argument that this is silly, and should be counted as ‘cheating’. But, it’s the same as relying on the processor, like using “this sentence” to refer to this sentence - you rely on your brain to do the inference for you.

What if eniuq was a builtin keyword in Python? A perfect self-rep would then be just a call away:

eniuq('eniuq')


What if quine was a verb in the English language? We could reduce a lot of explicit cognitive processes required for inference. The Epimenides paradox would then be:

“yields falsehood if quined” yields falsehood if quined

Now that we are talking about self-rep, here’s one last piece of entertainment for you.

Tupper’s self-referential formula

This formula is defined through an inequality:

$${1 \over 2} < \left\lfloor \mathrm{mod}\left(\left\lfloor {y \over 17} \right\rfloor 2^{-17 \lfloor x \rfloor - \mathrm{mod}(\lfloor y\rfloor, 17)},2\right)\right\rfloor$$

If you take that absurd thing above, and move around in the cartesian plane for the coordinates $$0 \le x < 106, k \le y < k + 17$$, where $$k$$ is a 544 digit integer (just bear with me here), color every pixel black for True, and white otherwise, you'd get:

This doesn't end here. If $$k$$ is now replaced with another integer containing 291 digits, we get yours truly:
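Why does any of this work? At integer coordinates the inequality just reads out one binary digit of $$\lfloor y / 17 \rfloor$$, so $$k$$ is nothing more than a bitmap in disguise. Here is a small sketch of that idea. It uses a tiny 4-column bitmap rather than the 106-column one above, and encode, decode and on_pixels are names I made up; the $$k$$ it computes is not either of the integers mentioned in this post.

```python
from fractions import Fraction
from math import floor

HEIGHT = 17


def tupper(x, y):
    """Evaluate the inequality at integer (x, y) with exact arithmetic."""
    value = Fraction(y // HEIGHT) * Fraction(2) ** (-HEIGHT * x - (y % HEIGHT))
    return Fraction(1, 2) < floor(value % 2)


def encode(on_pixels):
    """Pack pixels (x, row) into k: bit 17*x + row of k / 17 marks a black pixel."""
    bits = sum(1 << (HEIGHT * x + row) for x, row in on_pixels)
    return HEIGHT * bits


def decode(k, width):
    """Recover the black pixels by testing the formula on the strip k <= y < k + 17."""
    return {
        (x, row)
        for x in range(width)
        for row in range(HEIGHT)
        if tupper(x, k + row)
    }


on_pixels = {(0, 0), (1, 5), (2, 16), (3, 8)}
k = encode(on_pixels)
print(decode(k, width=4) == on_pixels)
```

So the formula is less a self-portrait than a very patient bitmap viewer: give it the right $$k$$ and it will draw anything that fits in 17 rows, itself included.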

TeX User Group Conference 2019, Palo Alto

The TeX User Group 2019 conference was held between August 9-11, 2019 at the Sheraton Palo Alto Hotel, in Palo Alto, California.

I wanted to attend TUG 2019 for two main reasons - to present my work on the “XeTeX Book Template”, and also to meet my favourite computer scientist, Prof. Donald Knuth. He does not travel much, so it was one of those rare opportunities for me to meet him in person. His creation, the TeX computer typesetting system, in which you can represent any character mathematically and also program and transform it, is beautiful and powerful, and is the best typesetting software in the world. I have been using TeX extensively for my documentation and presentations over the years.

Day I

I reached the hotel venue only in the afternoon of Friday, August 9, 2019, as I was also visiting Mountain View/San Jose on official work. I quickly checked into the hotel and completed my conference registration formalities. When I entered the hall, Rishi T from STM Document Engineering Private Limited, Thiruvananthapuram was presenting a talk on “Neptune - a proofing framework for LaTeX authors”. His talk was followed by an excellent poetic narration by Pavneet Arora, who happened to be a Vim user but also mentioned that he was eager to listen to my talk on XeTeX and GNU Emacs.

After a short break, Shreevatsa R shared his experiences of trying to understand the TeX source code, and the lessons learnt in the process. It was a very informative user experience report on the challenges he faced in navigating and learning the TeX code. Petr Sojka, from Masaryk University, Czech Republic, shared his students’ experience of using TeX in a detailed field report. I then proceeded to give my talk on the “XeTeX Book Template”, on creating multi-lingual books using GNU Emacs and XeTeX. It was well received by the audience. The final talk of the day was by Jim Hefferon, who analysed questions from newbies in different LaTeX groups and on StackExchange, and gave a wonderful summary of what newbies want. He is a professor of Mathematics at Saint Michael’s College, and is well-known for his book on Linear Algebra, prepared using LaTeX. It was good to meet him, as he is also a Free Software contributor.

The TUG Annual General Meeting followed, with discussions on how to grow the TeX community, the challenges faced, membership fees, financial reports, and plans for the next TeX user group conference.

Day II

The second day of the conference began with Petr Sojka and Ondřej Sojka presenting on “The unreasonable effectiveness of pattern generation”. They discussed the Czech hyphenation patterns along with a pattern generation case study. This talk was followed by Arthur Reutenauer presenting on “Hyphenation patterns in TeX Live and beyond”. David Fuchs, a student who worked with Prof. Donald Knuth on the TeX project in 1978, then presented on “What six orders of magnitude of space-time buys you”, where he discussed the design trade-offs in the TeX implementation between the hardware of the old days and that of the present day.

After a short break, Tom Rokicki, who was also a student at Stanford and worked with Donald Knuth on TeX, gave an excellent presentation on searching and copying text in PDF documents generated by TeX for Type-3 bitmap fonts. This session was followed by Martin Ruckert’s talk on “The design of the HINT file format”, which is intended as a replacement of the DVI or PDF file format for on-screen reading of TeX output. He has also authored a book on the subject - “HINT: The File Format: Reflowable Output for TeX”. Doug McKenna had implemented an interactive iOS math book with his own TeX interpreter library. This allows you to dynamically interact with the typeset document in a PDF-free ebook format, and also export the same. We then took a group photo:

I then had to go to Stanford, so I missed the post-lunch sessions, but returned for the banquet dinner in the evening. I was able to meet and talk with Prof. Donald E. Knuth in person. Here is a memorable photo!

He was given a few gifts at the dinner, and he stood up, thanked everyone and said that he “stood on the shoulders of giants like Isaac Newton and Albert Einstein.”


I had a chance to meet a number of other people who valued the beauty, precision and usefulness of TeX. Douglas Johnson had come to the conference from Savannah, Georgia and is involved in the publishing industry. Rohit Khare, from Google, who is active in the Representational State Transfer (ReST) community, shared his experiences with typesetting. Nathaniel Stemen is a software developer at Overleaf, which is used by a number of university students as an online, collaborative LaTeX editor. Joseph Weening, who was also once a student of Prof. Donald Knuth, and is at present a Research Staff member at the Institute for Defense Analyses Center for Communications Research in La Jolla, California (IDA/CCR-L), shared his experiences of working with the TeX project.

Day III

The final day of the event began with Antoine Bossard talking on “A glance at CJK support with XeTeX and LuaTeX”. He is an Associate Professor of the Graduate School of Science, Kanagawa University, Japan. He has been conducting research regarding Japanese characters and their memorisation. This session was followed by a talk by Jaeyoung Choi on “FreeType MF Module 2: Integration of Metafont and TeX-oriented bitmap fonts inside FreeType”. Jennifer Claudio then presented the challenges in improving Hangul to English translation.

After a short break, Rishi T presented “TeXFolio - a framework to typeset XML documents using TeX”. Boris Veytsman then presented the findings of research done at the College of Information and Computer Science, University of Massachusetts, Amherst on “BibTeX-based dataset generation for training citation parsers”. The last talk before lunch was by Didier Verna on “Quickref: A stress test for Texinfo”. He teaches at École Pour l’Informatique et les Techniques Avancées, and is a maintainer of XEmacs, Gnus and BBDB. He is also an avid Lisper and one of the organizers of the European Lisp Symposium!

After lunch, Uwe Ziegenhagen demonstrated using LaTeX to prepare and automate exams. This was followed by a field report by Yusuke Terada, on how they use TeX to develop a digital exam grading system at large scale in Japan. Chris Rowley, from the LaTeX project, then spoke on “Accessibility in the LaTeX kernel - experiments in tagged PDF”. Ross Moore joined remotely for the final session of the day to present on “LaTeX 508 - creating accessible PDFs”. Videos of these last two talks are available online.

A number of TeX books were made available for free for the participants, and I grabbed quite a few, including a LaTeX manual written by Leslie Lamport. Overall, it was a wonderful event, and it was nice to meet so many like-minded Free Software people.

A special thanks to Karl Berry, who put in a lot of effort in organizing the conference, but could not make it due to a car accident.

The TeX User Group Conference in 2020 is scheduled to be held at my alma mater, Rochester Institute of Technology.

22

Happy Birthday Dear me!

A panegyric about my mentor, Omar Bhai

I was still up at this unearthly hour, thinking about life for a while now - fumbled thoughts about where I had come, where I started, and quite expectedly, Omar Bhai, your name popped in.

The stream continued. I started thinking about everything I’ve learned from you and was surprised by the sheer volume of thoughts that followed. I felt nostalgic!

I made a mental note to type this out the next day.

I wanted to do this when we said our final goodbyes and you left for the States, but thank God, I didn’t - I knew that I would miss you, but never could I have guessed that it would be so overwhelming - I would’ve never written it as passionately as I do today.

For those of you who don’t already know him, here’s a picture:

I’m a little emotional right now, so please bear with me.

You have been warned - the words “thank you” and “thanks” appear irritatingly often below. I tried changing them, but no other words have quite the same essence.

How do I begin thanking you?

Well, let’s start with this - thank you for kicking me on my behind, albeit civilly, whenever I would talk nonsense (read: chauvinism). I can’t thank you enough for that!

I still can’t quite get how you tolerated the bigot I was and managed to be calm and polite. Thank You for teaching me what tolerance is!

Another thing which I learnt from you was what it meant to be privileged. I can no longer see things the way I used to, and this has made a huge difference. Thank You!

I saw you through your bad times and your good. The way you tackled problems, and how easy you made it look. Well, it taught me [drum roll] how to think (before acting and not the other way round). Thank You for that too!

And thank you for buying me books, and even more so, for lending me so many of them! And, more than that, for educating me about why to read books and how to read them. I love your collection.

You showed all of us young folks how powerful effective communication is. Thank You again for that! I know you never agree with this, but you are one hell of a speaker. I’ve always been a fan of you and your puns.

I wasn’t preparing for the GRE, but I sat in your sessions anyways, just to see you speak. The way you connect with the audience is just brilliant.

For all the advice you gave me on my relationships with people - telling me to back off when I was being toxic and dragging me off when I was on the receiving side - I owe you big time. Thank You!

Also, a hearty thank you for making me taste the best thing ever - yes, fried cheese it is. :D

Thank You for putting your trust and confidence in me!

Thank you for all of this, and much more!

Yours Truly, Rahul

Some pending logs!

September 11, 2019

It’s been a very long time since I last wrote here.

The reasons are nothing big, mainly:

1. I was not able to finish in time some of the tasks that I used to write about.
2. I was not well for a long time, which could be another reason.
3. Besides, life happened in many ways, which ultimately left me working on some other things first, because they seemed to be *important* at the time.

And, yes, there is no denying the fact that I was procrastinating too, because writing seems to be really hard at most times.

I did work on many things during this time, though, and I’ll try to write them up here as short and quick logs below.

• Around the second-last week of August, I worked on setting up a self-hosted OpenVPN server that supports scaling the number of clients. The infrastructure required two servers/VMs, each with a basic firewall setup and a non-root user with sudo privileges. One of them was to host the OpenVPN service and the other was to serve as a Certificate Authority (CA). You can refer to the following links for the related process.

• In the last week of August, I worked on another task, i.e. to read about syslog and figure out how each of the systems can email root’s mail to a certain email address for log collection. Thus, I read about syslog, how it works, its format and the various syslog message levels. The latter part of the task was accomplished using ssmtp as the mail program and writing cron jobs to actually send the mails to the intended email addresses. Check the following links for resources.

This one question always came up, many times, the students managed to destroy their systems by doing random things. rm -rf is always one of the various commands in this regard.

Kushal Das
• While I was doing the above task, at one point I ruined my local system’s mail server configs and actually ended up doing something which kushal writes about in one of his recent posts (quoted above). I was using the command rm -rf to clean up some of the left-over dependencies of some mail packages, but that eventually resulted in the machine crashing. The mess did not end there. Meanwhile, I made another extremely big mistake. I was trying to back up the crashed system to an external hard disk using dd. But because I had never used dd before, I again did something wrong, and this time I ended up losing ~500 GB of backed-up data. This is “the biggest mistake” and “the biggest lesson” I have learnt so far (now I know why one should have multiple backups). And as there was absolutely no way of getting that much data back, the last thing I did was format the hard disk into 2 partitions, one with an ext4 file system for Linux backups and the other as NTFS for everything else.

Thank you so much jasonbraganza for all the help and extremely useful suggestions during the time.

• Okay, now after all the hassle above, I got something really nice. This time, I received the “Raspberry Pi 4, 4GB, Complete Kit” from kushal.

Thank you very much kushal for the RPi, and another huge thanks for providing me with all the guidance and support that helped me reach where I am today.

• During the same time, I attended a dgplug guest session by utkarsh2102. This session gave me a “really” good beginner’s insight into how things actually work in the Debian Project. I owe a big thanks to utkarsh2102 as well, for he so nicely volunteered to help me, from there onwards, to actually start with the Debian project. I have started with DPMT and have packaged 4 Python modules so far. And now, I am looking forward to contributing to the Debian Ruby Team as well.

• With the start of September, I spent some time solving some basic Python problems from kushal’s lymworkbook. Those problems were related to some really simple sysadmin work. But for me, working through them and wrapping them in Python was a whole lot of learning. I hope I will continue to solve some more problems/issues from the lab.

• And lastly (and currently), I am back to reading and implementing concepts from Ops School curriculum.
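As a small aside on the syslog message levels mentioned in the logs above: every syslog message starts with a priority number that packs a facility and a severity level together (priority = facility * 8 + severity). The sketch below only illustrates that arithmetic; the function name decode_pri is made up here, and the severity keywords are the conventional short names.

```python
# Severity levels, from most severe (0) to least severe (7).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(pri):
    """Split a syslog priority number into (facility number, severity keyword)."""
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


# A message starting with <34> has facility 4 and severity 2 (critical).
print(decode_pri(34))
```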

Voila, finally, I have finished compiling the logs from the last 20 days or so of work and other stuff (and thus, I am also finally finishing my long-pending task of writing this post).

I will definitely try to be more consistent with my writing from now onwards.

That’s all for now. o/

Why I prefer SSH for Git?

In my last blog post, I wrote:

I'm an advocate of using SSH authentication and connecting to services like Github, Gitlab, and many others.

On this, I received a bunch of messages over IRC asking why I prefer SSH for Git over HTTPS.

I find the Github documentation quite helpful when it comes down to learning the basic operation of using Git and Github. So, what has Github to say about "SSH v/s HTTPS"?

Github earlier used to recommend using SSH, but they later changed it to HTTPS. The reason for Github's current recommendation could be:

• Easily accessible: HTTPS, in comparison to SSH, is more easily accessible. Why, you may ask? The reason is that SSH ports are often blocked by a firewall, and the only option left for you might be HTTPS. This is a very common scenario I've seen in Indian colleges and a few IT companies.

Why do I recommend the SSH way?

SSH keys provide Github with a way to trust a computer. For every machine that I have, I maintain a separate set of keys. I upload the public keys to Github or whichever Git-forge I'm using. I also maintain a separate set of keys for the websites. So, for example, if I have 2 machines and I use Github and Pagure then I end up maintaining 4 keys. This is like a 1-to-1 connection of the website and the machine.
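The one-key-per-machine-per-website bookkeeping above can be sketched in a few lines of Python. This is only an illustration under assumptions: the machine names, the key file naming scheme and the choice of ed25519 keys are made up here, and the script only prints the ssh-keygen commands instead of running them (each key would be generated on its own machine anyway).

```python
from itertools import product


def key_plan(machines, forges, key_dir="~/.ssh"):
    """Return one ssh-keygen command (as an argument list) per machine/forge pair."""
    plan = []
    for machine, forge in product(machines, forges):
        # Hypothetical naming scheme: one key file per (forge, machine) pair.
        key_path = f"{key_dir}/id_ed25519_{forge}_{machine}"
        plan.append(["ssh-keygen", "-t", "ed25519", "-f", key_path, "-C", f"{machine}@{forge}"])
    return plan


if __name__ == "__main__":
    for command in key_plan(["laptop", "desktop"], ["github", "pagure"]):
        print(" ".join(command))
```

Two machines and two websites give four commands, which matches the four keys in the example above.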

SSH is secure until you end up losing your private key. Even if you do end up losing your key, you can just log in using your username/password and delete that particular key from Github. I agree that the attacker can do nasty things, but that would be limited to repositories, and you would still have control of your account to quickly mitigate the problem.

On the other side, if you end up losing your Github username/password to an attacker, you lose everything.

I also once benefitted from using SSH with Github, but IMO, exposing that also exposes a vulnerability so I'll just keep it a secret :)

Also, if you are on a network that has SSH blocked, you can always tunnel it over HTTPS.

But, above all, do use the 2-factor authentication that Github provides. It's an extra layer of security for your account.

If you have other thoughts on the topic, do let me know over twitter @yudocaa, or drop me an email.

Photo by Christian Wiediger on Unsplash

Increasing Postgres column name length

This blog post is more like a bookmark for me; the solution was scavenged from the internet. Recently I have been working on an analytics project where I had to generate pivot transpose tables from the data. This was the first time I ran into the limits set on a Postgres database. Since it’s a pivot, the values of one of my columns would be transposed and used as column names, and this is where things started breaking. Writing to Postgres failed with an error stating that column names are not unique. After some digging I realized Postgres limits column names to 63 bytes and truncates anything longer; after truncation, multiple keys became the same, causing this issue.
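To make the collision concrete, here is a small Python sketch that mimics the truncation and reports which names would clash. It is an illustration under assumptions, not what Postgres runs internally: the helper names are made up, and 63 bytes is the limit for a default build (NAMEDATALEN - 1).

```python
from collections import defaultdict

MAX_IDENTIFIER_BYTES = 63  # NAMEDATALEN (64 by default) minus 1


def truncate_identifier(name, limit=MAX_IDENTIFIER_BYTES):
    """Cut a name to at most `limit` bytes without splitting a multi-byte character."""
    return name.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def find_collisions(names):
    """Group the names that become identical once truncated."""
    groups = defaultdict(list)
    for name in names:
        groups[truncate_identifier(name)].append(name)
    return {short: full for short, full in groups.items() if len(full) > 1}


if __name__ == "__main__":
    columns = ["a" * 70 + "_x", "a" * 70 + "_y", "short_name"]
    print(find_collisions(columns))
```

Running a check like this on the pivot keys before writing would point at the clashing names up front.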

The next step was to look at the data in my column; the values ranged from 20 to 300 characters long. I checked Redshift and BigQuery, and they had similar limitations too, 128 bytes. After looking for some time I found a solution: I downloaded the Postgres source, changed NAMEDATALEN to 301 in src/include/pg_config_manual.h (remember, the maximum column name length is always NAMEDATALEN – 1), and followed the steps from the Postgres docs to compile the source, install and run Postgres. This has been tested on Postgres 9.6 as of now, and it works.

Next up, I faced issues with the maximum number of columns: my pivot table had 1968 columns and Postgres has a limit of 1600 columns per table. Following this answer, I looked into the source comments, and that looked quite overwhelming. Also, I do not have control over how many columns there will be post pivot, so no matter what value I set, I might need more columns in future. So instead, I handled the scenario in my application code by splitting the data across multiple tables and storing them.
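A minimal sketch of that kind of split, in Python, could look like the following. The function name, the key column called id and the decision to repeat the key in every table (so the pieces can be joined back) are assumptions made for illustration; the actual application code is not shown in this post.

```python
MAX_COLUMNS = 1600  # Postgres limit on columns per table


def split_columns(columns, key_column="id", max_columns=MAX_COLUMNS):
    """Split pivot columns into groups that each fit in one table.

    Every group starts with the key column so the tables can be joined later.
    """
    per_table = max_columns - 1  # one slot is reserved for the key column
    return [
        [key_column] + columns[start:start + per_table]
        for start in range(0, len(columns), per_table)
    ]


if __name__ == "__main__":
    pivot_columns = [f"col_{i}" for i in range(1968)]
    for number, table_columns in enumerate(split_columns(pivot_columns), start=1):
        print(f"table {number}: {len(table_columns)} columns")
```

With 1968 pivot columns this yields two tables, and it keeps working if the pivot grows further.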


Subscriptions

Last updated:
November 26, 2020 06:01 AM
All times are UTC.