The little script I wrote in 2020, to get me my daily poetry fix, died a couple of months ago.
Life got in the way of me figuring out what was wrong.
I finally got around to it today.
A little bit of peeking under the hood revealed that Cloudflare did not like me using the Python Requests library to check the Poetry Foundation page.
Or maybe it just did not like the user-agent string that Requests sent it.
I was in no mood to go down that rabbit hole, so I just went looking for modern alternatives to Requests.
Found HTTPX and switched over to it.
And then while I was at it, replaced PyRSS2Gen with FeedGenerator.
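The fix itself was small: send a browser-like User-Agent instead of the library default. A rough sketch of the idea, assuming the third-party httpx package (the exact UA string here is an illustrative assumption, not the one Poemfeed ships):

```python
import httpx

# Cloudflare challenged the default "python-requests/x.y" User-Agent,
# so we present a browser-like one instead.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
        "Gecko/20100101 Firefox/115.0"
    )
}

def fetch_page(url: str) -> str:
    # Unlike Requests, httpx does not follow redirects by default.
    response = httpx.get(url, headers=HEADERS, follow_redirects=True)
    response.raise_for_status()
    return response.text
```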
And tada!
Poemfeed lives again!
Find the updated code over on GitHub.
Writing code today made me realise how far I have come since I wrote that code.
I figured out the issue, knew just how I wanted it to work, went out and researched the options, and then wrote and iterated on the code until it worked, all in a matter of hours. Kushal was right.
Writing lots of shitty code, over and over and over again, is the best way to get fluent.
Here’s to more contributing and lots more of writing code!
Yesterday, I came across a tweet by Sara Soueidan, which resonated with me. Mostly because I have had this discussion (or heated arguments) quite a few times with many folks. Please go and read her tweet thread since she mentions some really great points about why progressive enhancement is not anti-js. As someone who cares about security, privacy, and accessibility, I have always been an advocate of progressive enhancement. I always believe that a website (or any web-based solution) should be accessible even without JavaScript in the browser. And more often than not, people take me as someone who is anti-JavaScript. Well, let me explain with the help (a lot of help) of resources already created by other brilliant folks.
What is Progressive Enhancement?
Progressive enhancement is the idea of making a very simple, baseline foundation for a website that is accessible and usable by all users irrespective of their input/output devices, browsers (or user-agents), or the technology they are using. Then, once you have done that, you sprinkle more fancy animations and custom UI on top that might make it look more beautiful for users with the ideal devices.
If you saw the video by Heydon, I am sure you are starting to get some idea. Here I am going to reference another video, titled Visual Styling vs. Semantic Meaning, created by Manuel Matuzović. I love how, in this video, Manuel shares the idea of building semantically first and then visually styling it.
So I think a good way to do progressive enhancement is:
Start with HTML - This is a very good place to start, because not only does this ensure that almost all browsers and user devices can render it, but it also helps you think semantically instead of based on the visual design. That already starts making your website good not only for different browsers, but also for screen reader and assistive technology users.
Add basic layout CSS progressively - This is the step where you start applying visual design, but only the basic layouts. This progressively enhances the visual look of the website, and you can also add things like better focus styles. Be careful and check caniuse.com so that you add CSS features that are well supported across most browsers and versions. Remember what Heydon said? "A basic layout is not a broken layout".
Add fancy CSS progressively - Add more recent CSS features for layout and to progressively enhance the visual styling of your website. Here you can add newer features that make the design look even more polished.
Add fancy JavaScript sparkles progressively - If there are animations and interactions that you would like the user to have that are not possible with HTML & CSS, then start adding your JavaScript at this stage. JavaScript is often necessary for creating accessible custom UIs. So absolutely use it when necessary to progressively enhance the experience of your users based on the user-agents they have.
SEE! I told you to add JavaScript! So no, progressive enhancement is not about being anti-JavaScript. It's about progressively adding JavaScript wherever necessary to enhance the features of the website, without blocking the basic content, layout and interactions for non-JavaScript users.
Well, why should I not write everything in JavaScript?
I know it's trendy these days to learn fancy new JavaScript frameworks and write fancy new interactive websites. So many of you at this point must be like, "Why won't we write everything in JavaScript? Maybe you hate JavaScript, that's why you are talking about these random HTML & CSS things. What are those? Is HTML even a programming language?"
Well firstly, I love JavaScript. I have contributed to many JavaScript projects, including jQuery. So no, I don't hate JavaScript. But I love to use JavaScript for what it is supposed to be used for. And in most cases, layout or loading basic content isn't one of them.
But who are these people who need websites to work without JavaScript?
People who have devices with only older browsers. Remember, buying a new device isn't so easy in every part of the world and sometimes some devices may have user-agents that don't support fancy JavaScript. But they still have the right to read the content of the website.
People who care about their security and privacy. A lot of security and privacy focused people prefer using a browser like Tor Browser with JavaScript disabled to avoid any kind of malicious JavaScript or JavaScript based tracking. Some users even use extensions like NoScript with common browsers (Firefox, Chrome, etc.) for similar reasons. But just because they care about their security and privacy doesn't mean they shouldn't have access to website content.
People with not-so-great internet. Many parts of the world still don't have access to great internet and rely on 2G connections. Loading a huge bundled JavaScript framework with all its sparkles and features often takes an unrealistically long time. But they should still be able to access content from a website article.
So, yes. It's not about not using JavaScript. It's more about starting without JavaScript, and then adding your bells and whistles with JavaScript. That way, people who don't use JavaScript can still access at least the basic content.
Last week I attended the OAuth Security Workshop in Trondheim, Norway. It was a 3-day, single-track conference, where the first half of each day was pre-selected talks, and the second half was unconference talks/side meetings.
This was also my first proper conference after COVID emerged in the world.
Back to the starting line
After many years, I felt the whole excitement of being a total newbie in something, and of suddenly being able to meet all the people behind the ideas. I reached the conference hotel in the afternoon of day 0 and met the organizers, as they were in the lobby area. That chat went on for a long time, and as more and more people kept checking into the hotel, I realized that it was a kind of reunion for many of the participants. Though a few of them had met at a conference in California just a week earlier, they all were excited to meet again.
To understand how welcoming any community is, just notice how the community behaves towards new folks. I think the Python community stands high in this regard. And I am very happy to say the whole OAuth/OIDC/Identity-related community is excellent in this regard too. Even though I kept introducing myself as the new person in this identity land, not even for a single moment did I feel unwelcome. I attended OpenID-related working group meetings during the conference, had multiple hallway chats, and talked to people while walking around the beautiful city. Everyone was happy to explain things to me in detail, even though most of the people there have already spent 5-15+ years in the identity world.
The talks & meetings
What happens in Trondheim, stays in Trondheim.
I generally do not attend many talks at conferences, as they get recorded. But here, the conference was single track, and there were also no recordings. The first talk was related to formal verification, and this was the first time I saw that (scary, in my mind) maths on the big screen. But full credit to the speakers, as they explained things in such a way that even an average programmer like me understood each step. After this talk, we jumped into the world of OAuth/OpenID. One funny thing was that whenever someone mentioned an RFC number, we found its authors inside the meeting room.
In the second half, we had the GNAP master class from Justin Richer. And once again, the speaker explained deep technical details in a straightforward way, so that everyone in the room could understand them.
The evening before, people had mentioned a few times that in heated technical discussions, many RFC numbers would be thrown around, though there were not enough of them for me to get too scared :)
I also managed to meet Roland for the first time. We had longer chats about the
status of Python in the identity ecosystem and also about Identity
Python. I took some notes about how we can improve the
usage of Python in this, and I will most probably start writing about those in
the coming weeks.
In multiple talks, researchers & people from the industry pointed out mistakes made in the space from the security point of view. Even though, for many things, we have clear instructions in the specs, there is no guarantee that implementors will follow them properly, thus causing security gaps.
At the end of day 1, we had a special organ concert at the beautiful Trondheim Cathedral. On day 2, we had a special talk, “The Viking Kings of Norway”.
If you let me talk about my experience at the conference, I don’t think I will stop before 2 hours. There was so much excitement and new information, and the whole feeling of going back to my starting days when I knew nothing much. Every discussion was full of learning opportunities (all discussions are anyway, but being a newbie brings a different level of excitement). The only sadness was leaving Anwesha & Py back in Stockholm; this was the first time I was staying away from them after moving to Sweden.
Just before the conference ended, Aaron Parecki
gave me a surprise gift. I spent time with it during the whole flight back to
Stockholm.
This conference had the best food experience of my life for a conference, starting from breakfast to lunch, big snack tables, dinners, and restaurant food. At least 4 people said in front of me during the conference, “oh, it feels like we are only eating and sometimes talking”.
Another thing I really loved to see is that the two primary conference organizers are university roommates who are continuing their friendship and journey in a very beautiful way. Standing outside the hotel after midnight, talking about random things in life, and just being able to see two longtime friends excited about similar things, it felt so nice.
I also want to thank the whole organizing team, including the local organizers and Steinar; the whole team did a superb job.
select_for_update is the answer if you want to acquire a lock on a row. The lock is only released after the transaction is completed. This is similar to the SELECT ... FOR UPDATE statement in an SQL query.
>>> Dealership.objects.select_for_update().get(pk='iamid')
>>> # Here the lock is only acquired on the Dealership object
>>> Dealership.objects.select_related('oem').select_for_update(of=('self',))
select_for_update has these four arguments, with these default values:
– nowait=False
– skip_locked=False
– of=()
– no_key=False
Let's see what all these arguments mean.
nowait
Think of the scenario where the lock is already acquired by another query. In this case, do you want your query to wait, or to raise an error? This behavior can be controlled by nowait. If nowait=True, the query will raise a DatabaseError instead of waiting for the lock to be released.
skip_locked
As the name somewhat implies, it helps to decide whether to consider a locked row in the evaluated query. If skip_locked=True, locked rows will not be considered.
nowait and skip_locked are mutually exclusive; using both together will raise a ValueError.
of
In select_for_update, when the query is evaluated, the lock is also acquired on the related rows selected in the query. If you don't wish for that, you can use of, where you can specify the fields to acquire a lock on:
>>> Dealership.objects.select_related('oem').select_for_update(of=('self',))
# Just be sure we don't have any nullable relation with OEM
no_key
This helps you to create a weaker lock. This means other queries can still create new rows which refer to the locked rows (any referencing relationship).
A few more important points to keep in mind: select_for_update doesn't allow nullable relations, so you have to explicitly exclude these nullable conditions. In auto-commit mode, select_for_update fails with a TransactionManagementError; you have to explicitly put the code in a transaction. I have struggled around both of these points :).
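Putting those gotchas together, a minimal sketch of a locking update might look like the following. It assumes the Dealership model from the examples above plus a name field I made up for illustration, and it only runs inside a Django project whose database supports row locks:

```python
from django.db import transaction

# select_for_update() must be used inside a transaction; in auto-commit
# mode it raises TransactionManagementError.
with transaction.atomic():
    # Blocks until the row lock is free (nowait=False, the default).
    # The lock is released when the atomic block commits or rolls back.
    dealership = Dealership.objects.select_for_update().get(pk='iamid')
    dealership.name = 'Updated name'  # hypothetical field
    dealership.save()
```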
That is all you need to know about select_for_update to use it in your code when making changes to your database.
I finally got done moving the blog from Nikola to Hugo today.
I already wrote about why I did it.
These are a few more thoughts about what went into the endeavour; and some colophonesque details.
One really small hope is that it will help me learn Go.
The DevOps world that I now seek to enter, speaks Go.
I also now run two Go programs that are indispensable to me, Hugo and Miniflux.
And being the control freak that I am, I’d love to tweak them to just how I like them.
Nikola did help learn me Python, after all.
Speaking of Nikola, I’ve used it for close to four years now, until I moved.
I was a refugee from a self-hosted Ghost install. 1
I love it to death.
It’s done all I’ve asked of it … and more.
And it’s been fast and stable all these years.
The community (primarily Chris) has been kind and patient and really helpful. Chris and Roberto and the rest of the gang have built something rugged and enduring.
So other than having something fun to do while I was down in the dumps, why did I move to Hugo?
I like to learn by reading.
So other than practicing Go, the best way to learn it would be to read code.
And what better place to read code, than that of the tool I use to write? 2
Another point is that I probably outgrew what Nikola gave me out of the box, and customizing it (and learning how to) was something I just couldn’t quite wrap my head around.
While Hugo is probably the same way and its documentation is not quite as structured as Nikola’s, it is pretty popular and I can get answers to what I want to do, really, really quickly. 3
And Hugo has an awesome book, to learn from.
So, how did the move go?
I wrote a tiny bespoke Python script 4 to convert all my Nikola front matter to Hugo front matter.
That took me half a day, but it converted 677 posts that I was dreading I’d have to do by hand, which would have taken me weeks.
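The script itself was bespoke and throwaway, but the core of it can be sketched in a few lines. This is a hypothetical reconstruction, not my actual script: it assumes the post metadata sits at the top of the file as Nikola-style `.. key: value` lines, and it emits a Hugo YAML front matter block instead:

```python
import re

def nikola_to_hugo_front_matter(text: str) -> str:
    """Convert Nikola-style '.. key: value' metadata lines at the top
    of a post into a Hugo YAML front matter block, keeping the body."""
    meta = {}
    body_lines = []
    in_meta = True
    for line in text.splitlines():
        match = re.match(r"^\.\.\s+(\w+):\s*(.*)$", line)
        if in_meta and match:
            meta[match.group(1)] = match.group(2)
        else:
            in_meta = False       # first non-metadata line ends the header
            body_lines.append(line)
    front = ["---"]
    for key, value in meta.items():
        front.append(f'{key}: "{value}"')
    front.append("---")
    return "\n".join(front) + "\n" + "\n".join(body_lines)
```

Real posts have messier metadata (dates, tags, gnarly characters), which is where the second script came in.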
I wrote another bespoke script to convert some gnarly tags that Hugo choked on.
And just like that, my biggest boulder just fell into place.
That left me free to write a couple of what Hugo calls, Shortcodes, to ease my writing.
I started with one shortcode (a custom horizontal rule) and ended up with four!5
I customised the base as well as the theme’s config to make the site look like I wanted it; (from its various options.)
And I added some custom CSS to style my custom horizontal rules and footnotes, change the look of the links in the menu up top, style my subscribe form, and include the beautiful font you are reading this in.
In a word, slow, steady, iterative progress until I was done.
And who and what inspired me and who do I owe gratitude to?
Kushal Das, for sharing my love of typography and beating me to a custom font on his website.
Matthew Butterick, for the gorgeousness that is Valkyrie. The site originally used the beautiful Charter, but while it looked fetching on Windows and macOS, it would not render as nicely on Linux machines. 7
With Valkyrie, that no longer is the case. I have something beautiful to look at, all day! 8
Robin Williams, for teaching me all about CRAP and everything I know about type and design.
the aforementioned horizontal rule, two to center images and captions and one for my subscription ask at the bottom of the post. ↩︎
Nearly everything I did here, I cribbed from his blog config. ↩︎
which of course, is where I spend my entire day. ↩︎
I was hesitant to plonk down the money for Valkyrie. But I forgot all about that, once I saw my words rendered in the font. So, I am chalking this down to an investment. Besides Matthew’s prices are value for money (I’m just a little broke at the moment, hence the quibbling) and his usage license is generous. ↩︎
Anyone who has dealt with the <form> tag in HTML might have come across the autocomplete attribute. Most developers just put autocomplete="on" or autocomplete="off" based on whether they want users to be able to autocomplete the form fields or not. But there's much more to the autocomplete attribute than many folks may know.
Browser settings
Most widely used browsers (Firefox, Chrome, Safari, etc.), by default, remember information that is submitted using a form. When the user later tries to fill another form, browsers look at the name or type attribute of the form field, and then offer to autocomplete or autofill based on the saved information from previous form submissions. I am assuming many of you might have experienced these autocompletion suggestions while filling up forms. Some browsers, like Firefox, look at the id attribute and sometimes even the value of the <label> associated with the input field.
Autofill detail tokens
For a long time, the only valid values for the autocomplete attribute were "on" or "off", based on whether the website developer wanted to allow the browser to automatically complete the input. However, in the case of "on", it was left entirely to the browser to determine which value is expected by the input field. For some time now, though, the autocomplete attribute has also allowed some other values, which are collectively called autofill detail tokens.
These values help tell the browser exactly what the input field expects, without the browser needing to guess it. There is a big list of autofill detail tokens. Some of the common ones are "name", "email", "username", "organization", "country", "cc-number", and so on. Check the WHATWG Standard for autofill detail tokens to understand what the valid values are and how they are determined.
There are two different autofill detail tokens associated with passwords which have some interesting features apart from the autocompletion:
"new-password" - This is supposed to be used for "new password" fields or for "confirm new password" fields. This helps separate a current password field from a new password field. Most browsers and most password managers, when they see this value in the autocomplete attribute, will avoid accidentally filling in existing passwords. Some even suggest a new randomly generated password for the field if autocomplete has the "new-password" value.
"current-password" - This is used by browsers and password managers to autofill or suggest autocompletion with the current saved password for that email/username for that website.
The above two tokens really help in intentionally separating new password fields from login password fields. Otherwise, browsers and password managers don't have much to distinguish the two different fields and may guess wrong.
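As a quick illustration, a password-change form using these two tokens might look like this (the field names here are hypothetical ones I made up):

```html
<form action="/change-password" method="post">
  <!-- The login identifier, so password managers know which account this is -->
  <input type="text" name="user" autocomplete="username">
  <!-- Current password: browsers offer the saved password here -->
  <input type="password" name="old-pass" autocomplete="current-password">
  <!-- New password: browsers avoid filling the saved password here,
       and may suggest a randomly generated one instead -->
  <input type="password" name="new-pass" autocomplete="new-password">
  <input type="password" name="confirm-pass" autocomplete="new-password">
  <button>Change password</button>
</form>
```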
Privacy concerns
Now, all of the above points might already be giving privacy and security nightmares to many of you. Firstly, the above scenario works only if you are on the same computer, using the same accounts, and the same browsers. But there are a few things you can do to avoid autocompletion, or saving of the data when filling up the form.
Use the browser in privacy/incognito mode. Most browsers will not save the form data submitted to a website when opened in incognito mode. They will, however, still suggest autocompletion based on the saved information from normal mode.
If you already have autocomplete information saved from before but want to remove it now, you can. Most browsers allow you to clear form and search history from the browser.
If you want to disable autofill and autocomplete, you can do that as well from browser settings. This will also tell the browsers to never remember the values entered into the form fields.
You can find the related information for different browsers here:
Safari: You need to uncheck the different options mentioned here to disable them.
Now, if you are a privacy-focused developer like me, you might be wondering, "Can't I as a developer help protect privacy?". Yes, we can! That's exactly what autocomplete="off" is still there for. We can still add that attribute to an entire <form>, which will disable both remembering and autocompletion of all form data in that form. We can also add autocomplete="off" individually to specific <input>, <textarea>, or <select> elements to disable the remembering and autocompletion of specific fields instead of the entire form.
PS: Even with autocomplete="off", most browsers still offer to remember usernames and passwords. This is actually done for the same reason digital security trainers ask people to use password managers: so that users don't use the same simple passwords everywhere just because they have to remember them. As a digital security trainer, I would still recommend not using your browser's save password feature, and instead using a password manager. Password managers actually follow the same rule of remembering and auto-filling username and password fields even with autocomplete="off".
Accessibility
So, as a privacy-focused developer, you might be wondering, "Well, I should just use autocomplete="off" in every <form> I write from today". Well, that raises some huge accessibility concerns. If you love standards, then look specifically at Understanding Success Criterion 1.3.5: Identify Input Purpose.
There are folks with different disabilities who really benefit from the autocomplete tag, which makes it super important for accessibility:
People with disabilities related to memory, language, or decision-making benefit immensely from the auto-filling of data and from not needing to remember the information every time they fill up a form
People with disabilities who prefer images/icons for communication can use assistive technology to add icons associated with various input fields. A lot of them can benefit from proper autocomplete values when the name attribute alone is not suitable for determining the field's purpose.
People with motor disabilities benefit from not needing to manually input forms every time.
So, given that almost all browsers have settings to disable these features, it might be okay to not always use autocomplete="off". But, if there are fields that are essentially super sensitive, that you would never want the browser to save information about (e.g., government ID, one-time PIN, credit card security code, etc.), you should use autocomplete="off" on the individual fields instead of the entire <form>. Even if you really, really think that the entire form is super sensitive and you need to apply autocomplete="off" on the entire <form> element to protect your users' privacy, you should still at least use autofill detail tokens for the individual fields. This will ensure that the browser doesn't remember the data entered or suggest autofills, but will still help assistive technologies to programmatically determine the purpose of the fields.
A network is a group of computers and computing devices connected together through communication channels, such as cables or wireless media. The computers connected over a network may be located in the same geographical area or spread across the world. The Internet is the largest network in the world and can be called "the network of networks".
IP address
Devices attached to a network must have at least one unique network address identifier known as the IP (Internet Protocol) address. The address is essential for routing packets of information through the network.
Exchanging information across the network requires using streams of small packets, each of which contains a piece of the information going from one machine to another. These packets contain data buffers, together with headers which contain information about where the packet is going to and coming from, and where it fits in the sequence of packets that constitute the stream. Networking protocols and software are rather complicated due to the diversity of machines and operating systems they must deal with, as well as the fact that even very old standards must be supported.
IPv4 and IPv6
There are two different types of IP addresses available: IPv4 (version 4) and IPv6 (version 6). IPv4 is older and by far the more widely used, while IPv6 is newer and is designed to get past limitations inherent in the older standard and furnish many more possible addresses.
IPv4 uses 32 bits for addresses; there are only about 4.3 billion unique addresses available. Furthermore, many addresses are allotted and reserved, but not actually used. IPv4 is considered inadequate for meeting future needs because the number of devices available on the global network has increased enormously in recent years.
IPv6 uses 128 bits for addresses; this allows for 3.4 x 10^38 unique addresses. If you have a larger network of computers and want to add more, you may want to move to IPv6, because it provides more unique addresses. However, it can be complex to migrate to IPv6; the two protocols do not always inter-operate well. Thus, moving equipment and addresses to IPv6 requires significant effort and has not been quite as fast as was originally intended. We will discuss IPv4 more than IPv6, as you are more likely to deal with it.
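The address-space numbers quoted above are easy to verify, using Python here purely as a calculator:

```python
# IPv4: 32-bit addresses
ipv4_total = 2 ** 32
print(ipv4_total)           # 4294967296, i.e. roughly 4.3 billion

# IPv6: 128-bit addresses
ipv6_total = 2 ** 128
print(f"{ipv6_total:.1e}")  # 3.4e+38 unique addresses
```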
One reason IPv4 has not disappeared is that there are ways to effectively make many more addresses available by methods such as NAT (Network Address Translation). NAT enables sharing one IP address among many locally connected computers, each of which has a unique address only seen on the local network. While this is used in organizational settings, it is also used in simple home networks. For example, if you have a router hooked up to your Internet Provider (such as a cable system), it gives you one externally visible address, but issues each device in your home an individual local address.
decoding IPv4
A 32-bit IPv4 address is divided into four 8-bit sections called octets.
Example:
IP address → 172 . 16 . 31 . 46
Bit format → 10101100.00010000.00011111.00101110
NOTE: Octet is just another word for byte.
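This decoding is easy to reproduce with Python's standard library: each octet is simply the 8-bit binary rendering of the decimal value:

```python
import ipaddress

def to_bit_format(dotted: str) -> str:
    """Render a dotted-quad IPv4 address in its bit format."""
    addr = ipaddress.IPv4Address(dotted)  # also validates the address
    # addr.packed is the 4-byte big-endian form; format each byte as 8 bits
    return ".".join(f"{octet:08b}" for octet in addr.packed)

print(to_bit_format("172.16.31.46"))
# 10101100.00010000.00011111.00101110
```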
Network addresses are divided into five classes: A, B, C, D and E. Classes A, B and C are classified into two parts: Network addresses (Net ID) and Host address (Host ID). The Net ID is used to identify the network, while the Host ID is used to identify a host in the network. Class D is used for special multicast applications (information is broadcast to multiple computers simultaneously) and Class E is reserved for future use.
Class A network address – Class A addresses use the first octet of an IP address as their Net ID and use the other three octets as the Host ID. The first bit of the first octet is always set to zero. So you can use only 7-bits for unique network numbers. As a result, there are a maximum of 126 Class A networks available (the addresses 0000000 and 1111111 are reserved). Not surprisingly, this was only feasible when there were very few unique networks with large numbers of hosts. As the use of the Internet expanded, Classes B and C were added in order to accommodate the growing demand for independent networks.
Each Class A network can have up to 16.7 million unique hosts on its network. The range of host addresses is from 1.0.0.0 to 127.255.255.255.
Class B network address – Class B addresses use the first two octets of the IP address as their Net ID and the last two octets as the Host ID. The first two bits of the first octet are always set to binary 10, so there are a maximum of 16,384 (14-bits) Class B networks. The first octet of a Class B address has values from 128 to 191. The introduction of Class B networks expanded the number of networks, but it soon became clear that a further level would be needed.
Each Class B network can support a maximum of 65,536 unique hosts on its network. The range of host addresses is from 128.0.0.0 to 191.255.255.255.
Class C network address – Class C addresses use the first three octets of the IP address as their Net ID and the last octet as their Host ID. The first three bits of the first octet are set to binary 110, so almost 2.1 million (21-bits) Class C networks are available. The first octet of a Class C address has values from 192 to 223. These are most common for smaller networks which don’t have many unique hosts.
Each Class C network can support up to 256 (8-bits) unique hosts. The range of host addresses is from 192.0.0.0 to 223.255.255.255.
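The class rules above can be summed up in a small helper that inspects the first octet (a sketch for classful addressing only; modern networks use CIDR instead):

```python
def address_class(dotted: str) -> str:
    """Return the classful network class (A-E) of an IPv4 address."""
    first_octet = int(dotted.split(".")[0])
    if first_octet < 128:
        return "A"  # leading bit 0
    if first_octet < 192:
        return "B"  # leading bits 10
    if first_octet < 224:
        return "C"  # leading bits 110
    if first_octet < 240:
        return "D"  # leading bits 1110, multicast
    return "E"      # leading bits 1111, reserved

print(address_class("172.16.31.46"))  # B
```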
what is name resolution?
Name Resolution is used to convert numerical IP address values into a human-readable format known as the hostname. For example, 104.95.85.15 is the numerical IP address that refers to the hostname whitehouse.gov. Hostnames are much easier to remember!
Given an IP address, one can obtain its corresponding hostname. Accessing the machine over the network becomes easier when one can type the hostname instead of the IP address.
Then come the network configuration files, which are essential to ensure that interfaces function correctly. They are located in the /etc directory tree. However, the exact files used have historically depended on the particular Linux distribution and version being used.
For Debian family configurations, the basic network configuration files could be found under /etc/network/, while for Red Hat and SUSE family systems one needed to inspect /etc/sysconfig/network.
Network interfaces are a connection channel between a device and a network. Physically, network interfaces can proceed through a network interface card (NIC), or can be more abstractly implemented as software. You can have multiple network interfaces operating at once. Specific interfaces can be brought up (activated) or brought down (de-activated) at any time.
A network requires the connection of many nodes. Data moves from source to destination by passing through a series of routers and potentially across multiple networks. Routers maintain routing tables containing the addresses of each node in the network. The IP routing protocols enable routers to build up a forwarding table that correlates final destinations with next-hop addresses.
Let’s learn about more networking tools like wget and curl.
Sometimes, you need to download files and information, but a browser is not the best choice, either because you want to download multiple files and/or directories, or you want to perform the action from a command line or a script. wget is a command line utility that can capably handle these kinds of downloads. curl, on the other hand, is typically used to transfer data to or from a single specified URL.
File Transfer Protocol (FTP)
File Transfer Protocol (FTP) is a well-known and popular method for transferring files between computers using the Internet. This method is built on a client-server model. FTP can be used within a browser or with stand-alone client programs. FTP is one of the oldest methods of network data transfer, dating back to the early 1970s.
Secure Shell (SSH)
Secure Shell (SSH) is a cryptographic network protocol used for secure data communication. It is also used for remote services and other secure services between two devices on the network and is very useful for administering systems which are not easily available to physically work on, but to which you have remote access.
The VirtualHost directive in Apache configuration enables us to run multiple websites on a single server. I wanted to have two different VirtualHost entries for the same domain on different IPv4 and IPv6 addresses. I am using adas.example.org as the domain name in the example mentioned here.
I was doing the task via ansible for multiple servers. However, the IP addresses are different for every server, so I cannot hard-code them in a configuration file. I therefore tried to use 0.0.0.0 for IPv4 and [::] for IPv6. The configurations then looked like the following:
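Reconstructed from the description, the entries would look roughly like this (a sketch; directives other than the addresses are omitted):

```apache
<VirtualHost 0.0.0.0:80>
    ServerName adas.example.org
    # ...
</VirtualHost>

<VirtualHost [::]:80>
    ServerName adas.example.org
    # ...
</VirtualHost>

# plus the analogous pair of entries for port 443
```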
After a successful ansible run, I encountered the problem that Apache was serving only the topmost VirtualHost entry. Apache counted all the VirtualHost entries as the same, one for port 80 and one for 443. apachectl -S is a handy command to see the VirtualHost settings as parsed from the config file.
The apachectl -S command shows this:
VirtualHost configuration:
*:80 is a NameVirtualHost
default server adas.example.org (/etc/apache2/sites-enabled/vhost.conf:1)
port 80 namevhost adas.example.org (/etc/apache2/sites-enabled/vhost.conf:1)
port 80 namevhost adas.example.org (/etc/apache2/sites-enabled/vhost.conf:13)
*:443 is a NameVirtualHost
default server adas.example.org (/etc/apache2/sites-enabled/vhost.conf:7)
port 443 namevhost adas.example.org (/etc/apache2/sites-enabled/vhost.conf:7)
port 443 namevhost adas.example.org (/etc/apache2/sites-enabled/vhost.conf:19)
Solution
We cannot have 0.0.0.0 and [::] in the VirtualHost entries in this particular use case. 0.0.0.0 is the address in the first VirtualHost entry of the Apache configuration above, and Apache takes that first entry to mean all IPv4 addresses.
We have to use the exact IP addresses (both IPv4 and IPv6). With those in place, the output of apachectl -S correctly shows four different entries, as it is meant to.
To automate this via ansible, we can use ansible_default_ipv4.address and ansible_default_ipv6.address to get the IPv4 and IPv6 addresses.
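In the Jinja2 template for the vhost, that would look something like this (a sketch; the facts are the standard ansible ones named above, everything else is illustrative):

```
<VirtualHost {{ ansible_default_ipv4.address }}:80>
    ServerName adas.example.org
</VirtualHost>

<VirtualHost [{{ ansible_default_ipv6.address }}]:80>
    ServerName adas.example.org
</VirtualHost>
```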
In finding the solution to the problem, I asked for help in the #httpd IRC channel. I want to thank the httpd community for helping and showing the direction to the solution.
It has been a few months since I moved to Stockholm. From the moment we started planning the move here, the first thing I searched for was PyLadies Stockholm. Within days of arriving, I attended my first PyLadies meetup in November 2021. What a vibrant group it is! The best part was that I met a group of strangers in a new city, yet I felt at home.
I want to give a huge shout-out to the organizers of PyLadies Stockholm. They have kept the group active and alive (and how) during the adverse situation of the global pandemic.
April 6th, 2022, gave me my second chance to meet these amazing women. We had planned this meetup to be informal; it was designed to get to know our fellow PyLadies in person and, most importantly, for us, the organizers, to understand their expectations of PyLadies Stockholm. Since the group is planning to have more in-person meetups post-pandemic, the answers would give us clarity to pave the future path for the group.
The ladies attending the meetup came from diverse backgrounds: medical, physics, law, computer science, etc. It was such a wonderful experience for me, personally. After the introduction session, we discussed our journeys: why and how we learned Python. The materials that kept coming up in the discussion were:
Automate the Boring Stuff with Python by Al Sweigart
Beyond the Basic Stuff with Python by Al Sweigart.
One attendee, in particular, explained how one of the online sessions from PyLadies Stockholm benefited her learning process. Further, we shared some project ideas and old projects we worked on with Python.
We, as a group, decided that :
We want to focus on meetup topics catering to both Python beginners and experienced users.
We want to go for both online and offline events. While online events will widen our reach, offline events will allow us to be in the same room as each other and work on a project.
We want to give space to first-time PyLadies speakers at our events.
I want to thank my fellow co-organizer for guiding me through the meetup and doing all the heavy lifting :).
So you can expect some exciting beginner-friendly sessions by our PyLadies members and some advanced workshops in the coming months. In between, we will continue having “Book Club” sessions online. We are reading “Atlas of AI: Power, Politics and the Planetary Costs of Artificial Intelligence” by Kate Crawford. We have our next session on April 27th, 2022, from 17:30 to 18:45 CEST. You can sign up for the event here. If you want to stay updated about our events and know what is happening in the Python community in Sweden, follow us on Twitter, join our Slack channel #city-stockholm, and follow our meetup page.
See you all there.
There are a few command line tools which we can use to parse text files. These help on a day-to-day basis if you are using Linux, and it is essential for a Linux user to be adept at performing certain operations on files.
cat
cat is short for concatenate and is one of the most frequently used Linux command line utilities. It is often used to read and print files, as well as for simply viewing file contents. To view a file, use the following command:
$ cat <filename>
For example, cat readme.txt will display the contents of readme.txt on the terminal. However, the main purpose of cat is often to combine (concatenate) multiple files together. The tac command (cat spelled backwards) prints the lines of a file in reverse order. Each line remains the same, but the order of lines is inverted.
cat can be used to read from standard input (such as the terminal window) if no files are specified. You can use the > operator to create and add lines into a new file, and the >> operator to append lines (or files) to an existing file. We mentioned this when talking about how to create files without an editor.
To create a new file, at the command prompt type cat > and press the Enter key. This command creates a new file and waits for the user to edit/enter the text, after editing type in CTRL-D at the beginning of the next line to save and exit the editing.
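The workflow above can be sketched like this (file names are my own; interactively you would type the text and finish with CTRL-D instead of piping it in):

```shell
# Create a file from standard input
printf 'first line\nsecond line\n' | cat > notes.txt

cat notes.txt                       # view the file
cat notes.txt notes.txt > both.txt  # concatenate two files into one
tac notes.txt                       # print the lines in reverse order
```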
echo
echo simply displays (echoes) text. It is used simply, as in:
$ echo string
echo can be used to display a string on standard output (i.e. the terminal) or to place in a new file (using the > operator) or append to an already existing file (using the >> operator).
The -e option, along with the following switches, is used to enable special character sequences, such as the newline character or horizontal tab:
\n represents newline
\t represents horizontal tab.
echo is particularly useful for viewing the values of environment variables (built-in shell variables). For example, echo $USERNAME will print the name of the user who has logged into the current terminal.
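Putting these together, a quick sketch (the file name out.txt is my own):

```shell
echo "a plain string"
echo -e "col1\tcol2"          # -e enables \t (tab) and \n (newline)
echo "Logged in as: $USER"    # view an environment variable
echo "first" > out.txt        # create (or overwrite) a file
echo "second" >> out.txt      # append to it
```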
How to work with large files
System administrators need to work with configuration files, text files, documentation files, and log files. Some of these files may be large or become quite large as they accumulate data with time. These files will require both viewing and administrative updating.
For example, a banking system might maintain one simple large log file to record details of all of one day’s ATM transactions. Due to a security attack or a malfunction, the administrator might be forced to check for some data by navigating within the file. In such cases, directly opening the file in an editor will cause issues, due to high memory utilization, as an editor will usually try to read the whole file into memory first. However, one can use less to view the contents of such a large file, scrolling up and down page by page, without the system having to place the entire file in memory before starting. This is much faster than using a text editor.
head reads the first few lines of each named file (10 by default) and displays them on standard output. You can give a different number of lines as an option.
For example, if you want to print the first 5 lines from /etc/default/grub, use the following command:
$ head -n 5 /etc/default/grub
tail prints the last few lines of each named file and displays them on standard output. By default, it displays the last 10 lines. You can give a different number of lines as an option. tail is especially useful when you are troubleshooting an issue using log files, as you probably want to see the most recent lines of output.
For example, to display the last 15 lines of somefile.log, use the following command:
$ tail -n 15 somefile.log
To view compressed files
When working with compressed files, many standard commands cannot be used directly. For many commonly-used file and text manipulation programs, there is also a version especially designed to work directly with compressed files. These associated utilities have the letter "z" prefixed to their name. For example, we have utility programs such as zcat, zless, zdiff and zgrep.
Managing your files
Linux provides numerous file manipulation utilities that you can use while working with text files.
sort – is used to rearrange the lines of a text file, in either ascending or descending order according to a sort key. The default sort key is the order of the ASCII characters (i.e. essentially alphabetically).
uniq – removes duplicate consecutive lines in a text file and is useful for simplifying the text display.
paste – can be used to combine files side by side into a single file with multiple columns. The different columns are identified based on delimiters (spacing used to separate two fields), for example a blank space, a tab, or an Enter.
split – is used to break up (or split) a file into equal-sized segments for easier viewing and manipulation, and is generally used only on relatively large files. By default, split breaks up a file into 1000-line segments. The original file remains unchanged, and a set of new files with the same name plus an added prefix is created. By default, the x prefix is added. To split a file into segments, use the command split infile.
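A small sketch of these utilities in action (the sample files are my own):

```shell
printf 'banana\napple\napple\ncherry\n' > fruits.txt

sort fruits.txt               # alphabetical (ASCII) order
sort fruits.txt | uniq        # duplicates must be consecutive for uniq

printf 'a\nb\n' > c1.txt
printf '1\n2\n' > c2.txt
paste c1.txt c2.txt           # join files side by side (tab-delimited)

split -l 2 fruits.txt         # 2-line segments named xaa, xab
```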
Regular expressions are text strings used for matching a specific pattern, or to search for a specific location, such as the start or end of a line or a word. Regular expressions can contain both normal characters and so-called meta-characters, such as * and $.
grep is extensively used as a primary text searching tool. It scans files for specified patterns and can be used with regular expressions, as well as simple strings.
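For example (the log file is made up for illustration):

```shell
printf 'error: disk full\ninfo: started\nerror: timeout\n' > app.log

grep 'error' app.log        # lines containing a plain string
grep -c '^error' app.log    # count lines matching a regular expression
grep 'time.*' app.log       # meta-characters like . and * build patterns
```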
In my last blog post I talked about the verybad web application. It has multiple major security holes, which allow anyone to do remote code execution or read/write files on a server. Look at the source code to see all that you can do.
I am running one instance in public at http://verybad.kushaldas.in:8000/, and I asked Twitter to see if anyone could get access. The only difference is that this service has some of the latest security mitigations from systemd on a Fedora 35 box.
The service has been up for a few days now, and a few people tried for hours. One person managed to read the verybad.service file after a few hours of different tries. This prompted me to look into other available options from systemd.
The rest of the major protections come from the DynamicUser=yes configuration in systemd. This enables multiple other protections (which can not be turned off), like:
SUID/SGID files can not be created or executed
Temporary filesystem is private to the service
The entire file system hierarchy is mounted read-only except a few places
systemd can also block exec mapping of shared libraries or executables. This way we can block any random command execution, but still allow the date command to execute.
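As a sketch, such a unit might combine these options (option names are from systemd.exec(5); the exact paths are illustrative, not taken from the real verybad.service):

```ini
[Service]
DynamicUser=yes
ProtectSystem=strict
# Block exec mapping everywhere, then allow-list what the service
# legitimately needs, e.g. the date binary and the shared libraries
NoExecPaths=/
ExecPaths=/usr/bin/date /usr/lib /usr/lib64
```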
Please have a look at the man page and learn about the many options systemd now provides. I am finding this very useful, as it takes such a small amount of time to learn and use. The credit goes to Lennart and the rest of the maintainers.
Oh, just in case you are wondering: for a real service you should enable this along with other existing mechanisms, like SELinux or AppArmor.
Previously, I wrote about the first revision of our RepRap machine based on the Prusa i3 printer. This is a project which I have been working on with my younger brother. I will be talking about the enhancements, issues, and learnings from the second build of the printer.
3D printed printer parts
As soon as we got the first build of the printer working, we started printing printer parts. Basically, the idea is to replace the wooden parts with 3D printed parts, which have way better precision.
It has been ridiculously hard and I was about to pull my hair out at several points of this ordeal, but I finally managed to remove this stupid (revised to modest language here) DRM from my epub file. I took many side roads that led me nowhere and spent quite some hours finding a working solution, so I'm writing this down in case I (or someone else) need it again.
Why remove DRM from an ebook
First of all: I'm not saying you should do anything illegal (I'm also not saying that you shouldn't); however, a friend bought me an ebook as a gift. I had told him about this book and that it was on top of my to-read pile, I just hadn't bought it yet. So he purchased and gifted me a digital copy and asked if I would lend it to him when done reading. Sure thing. Except that I couldn't. Bummer!
What I downloaded from the store was an acsm (Adobe Content Server Message) file, which is needed to communicate with the Adobe servers. It verifies that I'm authorized to download the (DRM protected) book and to view the content on different devices that also use my Adobe ID.
A quick internet search...
revealed the following path that lay before me:
install and authorize Adobe Digital Editions
load the acsm in ADE to download the book
export the book as DRM protected epub
use calibre and the DeDRM plugin to remove the DRM protection
So far that sounded perfectly doable. Then I learned that Adobe Digital Editions is available for Windows, MacOS, Android ... of course there's no Linux app. I don't have a Windows or Apple machine, and I hate fiddling around with the phone, so wine it is.
However, on my main machine I also don't have wine - or rather I don't want to, because I hate to activate the multilib repositories. Wine is still all lib32. So I used another Ubuntu laptop that I keep for dirty work of that kind.
Unfortunately (for my mental health), my Ubuntu distribution offers calibre version 4.9x, while the latest DeDRM plugin requires calibre 5.x. Fine, I think, I'll just use the latest DeDRM release that plays along with my calibre 4.9. Well, of course not! That would require calibre running on Python 2; mine runs on Python 3. There I lose my hair.
So I ended up installing the latest version from calibre-ebook.com. Should you have to do this, make sure to pick the "isolated" install, which does not require root.
This worked for me
We start with a fresh wine prefix to install our ADE:
Evoking winecfg will initialize the new prefix. Make sure to pick Windows 10 here - I'll tell you why in a bit.
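A minimal sketch of that setup (the prefix path is my own choice):

```shell
# Use a fresh, dedicated prefix so ADE does not pollute an existing one
export WINEPREFIX="$HOME/.wine-ade"
mkdir -p "$WINEPREFIX"
# winecfg   # initializes the prefix; select "Windows 10" in the dialog
```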
When I started ADE and tried to open the acsm directly, it crashed. So I had to first manually authorize the laptop via the Help menu. An Adobe ID is required; create one if you don't have one already. Then we can load the acsm and get the DRM protected ebook. I could find the epub files in a new ADE sub-folder in my ~/Documents, but you can also save them to a different location in ADE.
To use calibre's DeDRM plugin to remove the protection, we need to extract the Adobe ID key from ADE, so make sure to have both installed. The DeDRM plugin is nice enough to offer a field for our wine prefix. However, this also means that a wine Python is going to be necessary. This is what you need to pick Windows 10 for in winecfg: Python 3.9 or 3.10 can easily be downloaded from https://www.python.org/, but the installers only work on Windows 8.1 or higher (and nobody wants to use 8.1). Remember to pick the 32 bit version; it's still meant for the wine environment.
So we install Python 3.10 and the necessary dependencies to make the scripts work. Set the check on "Add Python to Path" during the installation of Python, to spare yourself some headaches.
The scripts also require an OpenSSL distribution in your wine environment (I failed to install PyCrypto, the other viable option; don't ask me why, because I don't know). I did find a working package here. Make sure to pick version 1.1; this is what the DeDRM scripts use (and again choose the 32 bit variant, of course).
Having this set up, you should be able to add your Adobe ID key to the DeDRM plugin in the plugin's preferences dialog.
I actually didn't know that I would need all this, but after enough cursing I finally remembered to start calibre with calibre-debug -g, to actually learn why that stupid (kidding, it's great) script failed. In the end I just located the python script that extracts the ADE keys and ran it manually:
cd ~/.config/calibre/plugins/DeDRM/libraryfiles/
wine python adobekey.py
And this is where you have deserved a bottle of well-chilled beer. Whether you managed to import the key into the DeDRM plugin directly, or manually extracted it by running adobekey.py for later import, your plugin should now be armed and ready to have your book(s) added to calibre, and the DRM protection should be removed on import.
I also tried...
a tool called knock that promised to remove DRM protection with neither wine nor Adobe Digital Editions (ADE) required. I saw very positive comments, so I assume it can work somehow, but apparently I was too dumb to use it. It probably would have been easier to install via nix, but that's completely unknown territory for me, so I tried to install all the dependencies manually in a virtual environment, and as with many click apps, it was a pain to use it in any way that differs from the original intention. The binary packaged release also failed for me, but maybe you are smarter than I was.
EmacsConf 2021 happened in November last year. As in the last two years, it was an online conference. Thanks to all the volunteers and organizers, it was a great experience.
EmacsConf is the conference about the joy of Emacs, Emacs Lisp, and memorizing key sequences.
— EmacsConf website.
It was a 2-day conference with 45 talks in total. Despite being Thanksgiving weekend, the peak count of attendees was around 300.
I want to spend more time in 2022 to “Do Less!”, take things as they come, take breaks and try to travel or work on mechanical keyboards during those breaks. The time off that I took in 2021 - first half of August and last half of December, gave me a lot of breathing space and time to rethink priorities especially the first one where I partially overcame burnout due to several factors (Thanks to VMware for the generous leaves!).
Last year, I started actively taking care of my health. I jogged ~700 km in the last quarter (Oct-Dec), although I did not jog during the time I was travelling to Bangalore/Delhi/Kolkata. I want to continue the same trend and target at least 3000 km of jogging and a 10 km run in 2022.
2021 was also when I moved back to my hometown, Agartala, and that too after a span of 9 years. I spent a lot of time with close family members and friends from school. I plan to spend more time with people who I care about and who care about me, be it in Bangalore or Agartala.
Perfectly timed! I had a Raspberry Pi 4 lying around and had just ordered a few more to set up a home lab during the holidays. The newer Pis are yet to arrive, so better utilize the time writing a walkthrough on how to use Flatcar Container Linux on your Pis.
Hardware Requirements
Goes without saying, a Raspberry Pi 4
Form of storage, either USB and/or SD card. USB 3.0 drive recommended because of the much better performance for the price.
⚠️ WARNING ⚠️
The UEFI firmware used in this guide is an UNOFFICIAL firmware. There is a possibility of damage caused due to the usage of this firmware.
The author of this article would not be liable for any damage caused. Please follow this article at your own risk.
Update the EEPROM
The Raspberry Pi 4 uses an EEPROM to boot the system. Before proceeding, it is recommended to update the EEPROM. Raspberry Pi OS automatically updates the bootloader on system boot, so if you are using Raspberry Pi OS already, the bootloader may already be updated.
For manually updating the EEPROM, you can either use the Raspberry Pi Imager or the raspi-config. The former is the recommended method in the Raspberry Pi documentation.
We will also see later how the RPi4 UEFI firmware needs a recent version of EEPROM.
Using the Raspberry Pi Imager (Recommended)
Install the Raspberry Pi Imager software. You can also look for the software in your distribution repository.
Being a Fedora user I installed the software using dnf
dnf install rpi-imager
Launch Raspberry Pi Imager.
Select Misc utility images under Operating System.
Select Bootloader.
Select the boot mode, SD or USB.
Select the appropriate storage, SD or USB
Boot the Raspberry Pi with the new image and wait for at least 10 seconds.
The green activity LED will blink with a steady pattern and the HDMI display will be green on success.
Power off the Raspberry Pi and disconnect the storage.
# The update is pulled from the `default` release channel.
# The other available channels are: latest and beta.
# You can update the channel by changing the value of
# `FIRMWARE_RELEASE_STATUS` in the `/etc/default/rpi-eeprom-update`
# file. This is useful when you want features not yet
# available on the default channel.
# Install the update
sudo rpi-eeprom-update -a
# A reboot is needed to apply the update
# To cancel the update, you can use: sudo rpi-eeprom-update -r
sudo reboot
Installing Flatcar
Install flatcar-install script
Flatcar provides a simple installer script that helps install Flatcar Container Linux on the target disk. The script is available on Github, and the first step would be to install the script in the host system.
mkdir -p ~/.local/bin
# You may also add `PATH` export to your shell profile, i.e bashrc, zshrc etc.
export PATH=$PATH:$HOME/.local/bin
curl -LO https://raw.githubusercontent.com/flatcar-linux/init/flatcar-master/bin/flatcar-install
chmod +x flatcar-install
mv flatcar-install ~/.local/bin
Install Flatcar on the target device
Now that we have flatcar-install installed on our host machine, we can go ahead and install the Flatcar Container Linux image on the target device.
The target device could be a USB or SD Card. In my case, I reused the existing SD Card which I used in the previous steps. You can use a separate storage device as well.
The options that we will be using with the scripts are:
# -d DEVICE     Install Flatcar Container Linux to the given device.
# -C CHANNEL    Release channel to use
# -B BOARD      Flatcar Container Linux board to use
# -o OEM        OEM type to install (e.g. ami), using flatcar_production_<OEM>_image.bin.bz2
# -i IGNITION   Insert an Ignition config to be executed on boot.
The device would be the target device that you would like to use. You can use the lsblk command to find the appropriate disk. Here, I'm using /dev/sda, which was the device in my case.
With the given values of channel and board, the script would download the image, verify it with gpg, and then copy it bit for bit to disk.
In our case, Flatcar does not yet ship Raspberry Pi specific OEM images, so the value will be an empty string ''.
Pass the Ignition file, config.json in my case, to provision the Pi during boot.
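Putting those options together, the invocation would look something like this (a sketch; /dev/sda and config.json are from this walkthrough, while the stable channel and arm64-usr board values are my assumptions, so verify them against the Flatcar documentation before running):

```
sudo flatcar-install -d /dev/sda -C stable -B arm64-usr -o '' -i config.json
```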
The rpi-uefi community ships an SBBR-compliant (UEFI+ACPI), ArmServerReady ARM64 firmware for Raspberry Pi 4. We will use it to UEFI-boot Flatcar.
v1.17 of the pftf/RPi4 introduced two major changes:
Firstly, it enabled firmware boot directly from USB. This is particularly helpful if you are installing from a USB device. To add a fun story: I dropped my Pi and broke the SD card slot. Until the Pi gets repaired, I'm making use of direct USB boot 😎
Secondly, support for placing the Pi boot files directly into the EFI System Partition (ESP). This feature comes not from the rpi-uefi firmware itself, but from the upstream firmware from the Raspberry Pi Foundation. This is why it is recommended to update the Pi EEPROM at the very beginning.
Let’s move ahead with the final steps.
Place the UEFI firmware into the EFI System Partition.
Today, we are going to see how we can use the | operator in our Python code to achieve clean code.
Here is the code where we have used map and filter for a specific operation.
In [1]: arr = [11, 12, 14, 15, 18]
In [2]: list(map(lambda x: x * 2, filter(lambda x: x%2 ==1, arr)))
Out[2]: [22, 30]
The same code with Pipes.
In [1]: from pipe import select, where
In [2]: arr = [11, 12, 14, 15, 18]
In [3]: list(arr | where(lambda x: x % 2 == 1) | select(lambda x: x * 2))
Out[3]: [22, 30]
Pipe passes the result of one function to another, and it has built-in pipes like select, where, tee, and traverse.
Install Pipe
$ pip install pipe
traverse
Recursively unfold iterable:
In [11]: from pipe import traverse
In [12]: arr = [[1,2,3], [3,4,[56]]]
In [13]: list(arr | traverse)
Out[13]: [1, 2, 3, 3, 4, 56]
select()
An alias for map().
In [1]: from pipe import select
In [2]: arr = [11, 12, 14, 15, 18]
In [3]: list(arr | select(lambda x: x * 2))
Out[3]: [22, 24, 28, 30, 36]
where()
Only yields the matching items of the given iterable:
In [1]: arr = [11, 12, 14, 15, 18]
In [2]: list(arr | where(lambda x: x % 2 == 0))
Out[2]: [12, 14, 18]
sort()
Like Python's built-in sorted primitive. Allows cmp (Python 2.x only), key, and reverse arguments. By default, sorts using the identity function as the key.
That's all for today. In this blog post, you have seen how to install Pipe and use it to write clean, short code using the built-in pipes. You can check out more over here.
I have had the pleasure to talk with 30+ folks and help them in their journey in the field of computer science and/or growing in their career with Open Source Software. It has been an honour that so many wanted to talk to me and get my views.
For the month of December, I am going to take a break from the mentoring sessions, as I will be travelling on most weekends and will be out on vacation in the latter half of the month.
Fret not, I will try to make up for the lost time by doubling up my commitment for January 2022. But, in case you need to urgently talk with me, drop me a ping on hey [at] nabarun [dot] dev and I will try to schedule something which works for both of us.
Wish you all a very happy December! 🎉
PS: Stay tuned to the RSS feed! There are many articles which are languishing in my drafts, I may publish a few of them.
Martin M. Broadwell defines four stages of competence in Teaching for Learning:
unconscious incompetence
conscious incompetence
conscious competence
unconscious competence.
Specifically, unconscious incompetence means you are unable to perform a task correctly and are unaware of the gap.
Conscious incompetence means you are unable to perform a task correctly but are aware of the gap.
Conscious competence means you are capable of performing a task with effort.
Finally, unconscious competence means you are capable of performing a task effortlessly.
All engineers start out consciously or unconsciously incompetent. Even if you know everything about software engineering (an impossible task), you’re going to have to learn practical skills like those covered in this book. Your goal is to get to conscious competence as quickly as possible.
Cunningham’s Law And Bike-Shedding
We advise you to document conventions, onboarding procedures, and other oral traditions on your team. You will get a lot of comments and corrections. Do not take the comments personally.
The point is not to write a perfect document, but rather to write enough to trigger a discussion that fleshes out the details. This is a variation of Cunningham’s law, which states that “the best way to get the right answer on the internet is not to ask a question; it’s to post the wrong answer.”
Be prepared for trivial discussions to become drawn out, a phenomenon called bike-shedding. Bike-shedding is an allegory by C. Northcote Parkinson, describing a committee assigned to review designs for a power plant. The committee approves the plans within minutes, as they are too complex to actually discuss. They then spend 45 minutes discussing the materials for the bike shed next to the plant. Bike-shedding comes up a lot in technical work.
Mistakes are unavoidable (Learn by Doing!)
At one of Chris’s first internships, he was working on a project with a senior engineer. Chris finished some changes and needed to get them deployed. The senior engineer showed him how to check code into the revision control system, CVS. Chris followed the instructions, blindly running through steps that involved branching, tagging, and merging. Afterward, he continued with the rest of his day and went home.
The next morning, Chris strolled in cheerfully and greeted everyone. They did their best to respond in kind, but their spirits were low. When Chris asked what was up, they informed him that he had managed to corrupt the entire CVS repository. All of the company’s code had been lost. They had been up the entire night desperately trying to recover what they could and were eventually able to get most of the code back (except for Chris’s commits and a few others).
Chris was pretty shaken by the whole thing. His manager pulled him aside and told him not to worry: Chris had done the right thing working with the senior engineer.
Mistakes happen. Every engineer has some version of a story like this. Do your best, and try to understand what you’re doing, but know that these things happen.
I’m grateful for this beautiful life 🍀 , for my parents 👨👩👧👦 , and for every other thing that I’ve received as opportunities 🌟 & lessons 📝 & experiences 📈 from my life (so far), and the very kind & generous people 🧑🤝🧑 I’ve known.
One of those posts: not technical, written with a hot head, controversial topics. Don't read on if that already annoys you.
The argument
I had a hot discussion yesterday, and I'm not quite happy with how it went. It's not important how we got there, but the core argument was about why I think that little to no money should be spent on military and arms, while the same money should be spent on peace studies and the education of diplomats - by which I mean people who try to understand a different culture and initiate an exchange of values and ethics - not eloquently deceitful, but open and direct, fair and square.
My opposition's opinion was different: he claimed that we would never live in peace without an army enforcing the peace. That's an opinion I could argue about all day long, but then he said what really upset me: that it's human nature to have wars and to only care for oneself.
I strongly disagree.
As for myself, I don't want that. Does this make me not a human? When we hear, read or see in the news what war crimes are committed, how people are forced to live (if they may live), what people are capable of doing to each other, we call this inhuman. What do we actually mean when we say so? Are the people committing these crimes not human people, or is it the act that is inhuman?
Again, my opponent argues that you should not call someone a liar, but rather say you lied - so to condemn the act of lying but not the person. Well, do we do that in other situations? May we call someone a murderer or a thief when they kill people or steal things? Is it a matter of frequency how often I eat meat while still calling myself a vegetarian?
Being human
The definition of what is or is not human may differ vastly. However, it is schizophrenic to agree that something is inhuman and then not take the consequences: as a human being, stop acting inhuman!
Like being a vegetarian or not being a liar and murderer, for me being human is a continuous process that demands continuous work on ourselves. I have to work on myself to act how I want a human to act. When I think that the way meat is "produced" today on earth is inhuman, then I will have to change my diet. I have a vote, I can choose what to buy (or not to buy) or what to write on my blog. I can show people that I disagree, and I can sit together with them, discuss differences of opinion and find a rational consensus, because that's how I think a human would act.
On a fun side note, my opponent also argued that animals fight each other too, so it is just natural. Well, be an animal then.
This is the first post on my blog in a while. I guess this is coming after almost 2 years and 9 months. And yes, I never wrote those end-of-year review posts either.
I've been playing around with containers for a few years now. I find them very useful.
If you host your own services, like I do, you probably write a lot of nginx configurations, or maybe Apache ones.
If that's the case, then you have your own solution for getting certificates.
I'm also assuming that you are using Let's Encrypt with certbot or something similar.
Well, I didn't want to anymore. It was time to consolidate. Here comes Traefik.
an open-source Edge Router that makes publishing your services a fun and easy experience. It receives requests on behalf of your system and finds out which components are responsible for handling them.
Which made me realize that I still need nginx somewhere. We'll see when we get to it. Let's focus on Traefik.
Configuration
If you run a lot of containers and manage them, then you probably use docker-compose.
I'm still using version 2.3; I know I'm due for an upgrade, but I'm working on it slowly.
It's a bigger project… One step at a time.
Let's start from the top, literally.
---
version: '2.3'
services:
Note
Upgrading to version 3.x of docker-compose requires creating networks to link containers together. It's worth investing in, but this is not a docker-compose tutorial.
Let's Encrypt has set limits on how many certificates you can request in a certain amount of time. To test your certificate request and renewal processes, use their staging infrastructure; it is made for exactly that purpose.
Then we mount it, for persistence.
- "./traefik/acme.json:/acme.json"
Let's not forget to add our Cloudflare API credentials as environment variables for Traefik to use.
With a little bit of Traefik documentation searching and a lot of help from htpasswd, we can create a basicauth login to protect the dashboard from public use.
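Putting those pieces together, a Traefik service in this compose file might look roughly like the sketch below. This assumes Traefik v2 label and flag syntax; the domain, email, and hashed password are placeholders you would replace with your own.

```yaml
  traefik:
    image: traefik:v2.4
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    environment:
      # Cloudflare API credentials, used for the DNS-01 challenge
      - CF_API_EMAIL=you@example.com
      - CF_API_KEY=your-cloudflare-api-key
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      # the acme.json file we created earlier, mounted for persistence
      - "./traefik/acme.json:/acme.json"
    command:
      - "--api.dashboard=true"
      - "--providers.docker=true"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsencrypt.acme.dnschallenge.provider=cloudflare"
      - "--certificatesresolvers.letsencrypt.acme.storage=/acme.json"
    labels:
      # basicauth for the dashboard; generate the user:hash pair with htpasswd
      # (note: $ characters in the hash must be doubled inside compose files)
      - "traefik.enable=true"
      - "traefik.http.routers.dashboard.rule=Host(`traefik.example.com`)"
      - "traefik.http.routers.dashboard.service=api@internal"
      - "traefik.http.routers.dashboard.middlewares=dashboard-auth"
      - "traefik.http.middlewares.dashboard-auth.basicauth.users=admin:$$apr1$$examplehash"
```

Remember to point the ACME resolver at Let's Encrypt's staging CA while testing, for the rate-limit reasons mentioned above.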
[engine x] is an HTTP and reverse proxy server, a mail proxy server, and a generic TCP/UDP proxy server, originally written by Igor Sysoev.
In this example, we're going to assume you have a static blog generated by a static site generator of your choice, and that you would like to serve it for people to read.
So let's do this quickly as there isn't much to tell except when it comes to labels.
We are mounting the blog directory from our host to /usr/share/nginx/html/blog as read-only into the nginx container. We are also providing nginx with a template configuration and passing the variables as environment variables as you noticed. It is also mounted as read-only. The configuration template looks like the following, if you're wondering.
server {
listen ${NGINX_BLOG_PORT};
server_name localhost;
root /usr/share/nginx/html/${NGINX_BLOG_HOST};
location / {
index index.html;
try_files $uri $uri/ =404;
}
}
Traefik configuration
So, the Traefik configuration at this point is a little bit tricky the first time around.
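As an illustration, the nginx service described above with Traefik labels attached might look something like this sketch. The hostname, router name, and certificate resolver name are my placeholders, and the labels assume Traefik v2:

```yaml
  nginx:
    image: nginx:alpine
    restart: unless-stopped
    environment:
      - NGINX_BLOG_PORT=80
      - NGINX_BLOG_HOST=blog
    volumes:
      # the generated blog and the nginx template, both read-only
      - "./blog:/usr/share/nginx/html/blog:ro"
      - "./nginx/blog.conf.template:/etc/nginx/templates/blog.conf.template:ro"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.blog.rule=Host(`blog.example.com`)"
      - "traefik.http.routers.blog.entrypoints=websecure"
      - "traefik.http.routers.blog.tls.certresolver=letsencrypt"
```

The router rule is what tells Traefik which requests to forward to this container; the certresolver label must match the resolver name configured on the Traefik service itself.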
As my followers well know by now, I am a tinkerer at heart. Why do I do things? No one knows! I don't even know.
All I know, all I can tell you, is that I like to see what I can do with the tools I have at hand, and how I can bend them to my will.
Why, you may ask. The answer is a bit complicated; part of who I am, part of what I do as a DevOps engineer. Bottom line is, this time I was curious.
I went down a road that taught me so much more about containers, docker, docker-compose and even Linux itself.
The question I had was simple, can I run a container only through Tor running in another container?
Tor
I usually like to start topics that I haven't mentioned before with definitions. In this case: what is Tor, you may ask?
What is Tor?
Tor is free software and an open network that helps you defend against traffic analysis, a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships, and state security.
That home page is a bit obscure now, since it was replaced by the new design of the website. Don't get me wrong, I love what Tor has done with all the services they offer. But putting so much emphasis on the browser alone, and leaving the rest of the website for dead, makes me, I have to say, a bit sad.
Anyway, let's share the love for Tor and thank them for the beautiful project they offered humanity.
Now that we thanked them, let's abuse it.
Tor in a container
The task I set out to accomplish relied on Tor being containerized.
The first thing I do is, simply, not re-invent the wheel.
Let's find out if someone has already taken on that task.
With a little bit of searching, I found the dperson/torproxy docker image.
It isn't ideal, but I believe it is written to be rebuilt.
Can we run it?
docker run -it -p 127.0.0.1:8118:8118 -d dperson/torproxy
curl -Lx http://localhost:8118 http://jsonip.com/
And this is definitely not your IP. Don't take my word for it!
Go to http://jsonip.com/ in a browser and see for yourself.
Now that we know we can run Tor in a container effectively, let's kick it up a notch.
docker-compose
I will be testing and making changes as I go along. For this reason, it's a good idea to use docker-compose to do this.
Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration.
Now that we saw what the docker team has to say about docker-compose, let's go ahead and use it.
First, let's implement what we just ran ad-hoc in docker-compose.
Let's put it all together in a docker-compose.yaml file and run it.
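A minimal sketch of such a docker-compose.yaml, based on the ad-hoc run above plus an air-gapped Ubuntu container to test with (the image tags and the sleep command are my assumptions):

```yaml
---
version: '2.3'

services:
  torproxy:
    image: dperson/torproxy
    ports:
      - "127.0.0.1:8118:8118"

  air-gapped:
    image: ubuntu:20.04
    container_name: air-gapped
    command: sleep infinity   # keep the container alive so we can exec into it
    networks:
      - no-internet

networks:
  no-internet:
    internal: true   # containers on this network cannot reach the outside
```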
docker-compose up -d
Keep that terminal open, and let's put the hypothesis to the test to see if it rises to the level of a theory.
docker exec air-gapped apt-get update
Aaaaand…
Err:1 http://archive.ubuntu.com/ubuntu focal InRelease
Temporary failure resolving 'archive.ubuntu.com'
Err:2 http://security.ubuntu.com/ubuntu focal-security InRelease
Temporary failure resolving 'security.ubuntu.com'
Err:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Temporary failure resolving 'archive.ubuntu.com'
Err:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease
Temporary failure resolving 'archive.ubuntu.com'
Reading package lists...
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-updates/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-backports/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/focal-security/InRelease Temporary failure resolving 'security.ubuntu.com'
W: Some index files failed to download. They have been ignored, or old ones used instead.
Looks like it's real, peeps. Hooray!
Putting everything together
Okay, now let's put everything together. The list of changes we need to make is minimal.
First, I will list them, then I will simply write them out in docker-compose.
Create an internet network for the Tor container
Attach the internet network to the Tor container
Attach the no-internet network to the Tor container so that our air-gapped container can access it.
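Written out as a sketch, those three changes might look like this (the network and service names are mine):

```yaml
---
version: '2.3'

services:
  torproxy:
    image: dperson/torproxy
    networks:
      - internet      # change 2: Tor gets a way out to the internet
      - no-internet   # change 3: the air-gapped container can reach Tor

  air-gapped:
    image: ubuntu:20.04
    container_name: air-gapped
    command: sleep infinity
    networks:
      - no-internet

networks:
  internet: {}        # change 1: a network with internet access for Tor
  no-internet:
    internal: true
```

With this in place, the air-gapped container's only path to the outside world is through the Tor proxy sitting on the shared internal network.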
Chatted with a few people about the state of Python and pip on Debian.
General notes
Looking back, I think I picked up a couple of new projects based on
random brain waves I had! That’s perfectly timed, because I’ve decided
to pivot away from my earlier approach of “yay, more responsibility!”.
What next?
This is getting increasingly harder to decide on, as my free time
chunks are becoming smaller and I’m picking up bigger projects. :)
Technical
Sphinx Theme PEP 517 stuff: Make the initial release.
sphinx-basic-ng: Make the first usable release.
pip: Clear some of the backlog on the pull request front.
pip: More progress on the documentation rewrite.
Communication
Spend more time looking into the Python lockfile standardisation effort.
Write a blog post on automated code formatting.
Find more speaking opportunities to talk about things that aren’t Python packaging!
I presented 2 talks at FOSDEM: in the Python devroom and
Open Source Design devroom. Shout-out to Bernard Tyers, for all the help and
the bazillion reminders to make sure I do all the things on time. :)
Collaborating on designing a lockfile format for Python, that can hopefully
be standardised for interoperability.
General notes
Onboarding in a new company, relocating internationally, settling into a new
space has been… well, it’s all been a very interesting learning experience.
Given the fairly strict lockdown and the percentage of people wearing masks
in my locality, I’ve spent a lot of time indoors. Looking forward to the
social weekends experiment I’m doing.
What next?
Technical
pip: Work on the documentation rewrite, hopefully to get it ready in time for
the next release.
pip: Clear some of the backlog on the pull request front.
pip: General discussions for new features and enhancements.
TOML: Work on writing the compliance test suite.
TOML: Bring toml for Python back from the dead.
Furo: Make the first stable release.
Start work on the other Sphinx theme I have in mind.
Communication
Spend more time looking into the Python lockfile standardisation effort.
Catch up on the Python-on-Debian saga, and see how I can contribute
constructively.
I recently read a paper titled Understanding Real-World Concurrency Bugs in Go (PDF), which studies concurrency bugs in Golang and comments on the new message passing primitives that the language is often known for.
I am not a very good Go programmer, so this was an informative lesson in various ways to achieve concurrency and synchronization between different threads of execution. It is also a good read for experienced Go developers as it points out some important gotchas to look out for when writing Go code. The fact that it uses real world examples from well known projects like Docker, Kubernetes, gRPC-Go, CockroachDB, BoltDB etc. makes it even more fun to read!
The authors analyzed a total of 171 concurrency bugs from several prominent Go open source projects and categorized them along two orthogonal dimensions, one for the cause of the bug and one for its behavior. The cause is split between the two major schools of concurrency:
Along the cause dimension, we categorize bugs into those that are caused by misuse of shared memory and those caused by misuse of message passing
and the behavior dimension is similarly split into
we separate bugs into those that involve (any number of) goroutines that cannot proceed (we call them blocking bugs) and those that do not involve any blocking (non-blocking bugs)
Interestingly, they chose the behavior to be blocking instead of deadlock, since the former implies that at least one thread of execution is blocked due to some concurrency bug, while the rest might continue execution, so it is not a deadlock situation.
Go has primitive shared memory protection mechanisms like Mutex, RWMutex etc. with a caveat
Write lock requests in Go have a higher privilege than read lock requests.
as compared to pthread in C. Go also has a new primitive called sync.Once that can be used to guarantee that a function is executed only once. This can be useful in situations where some callable is shared across multiple threads of execution but shouldn't be called more than once. Go also has sync.WaitGroup, which is similar to pthread_join: it waits for various threads of execution to finish executing.
Go also uses channels for message passing between different threads of execution, called goroutines. Channels can be buffered or unbuffered (the default), the difference being that with a buffered channel the sender and receiver don't block on each other (until the buffered channel is full).
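A tiny sketch (mine, not from the paper) showing the difference: a send on a buffered channel succeeds while there is room in the buffer, whereas a send on an unbuffered channel blocks until a receiver is ready.

```go
package main

import "fmt"

func main() {
	// Buffered: capacity 1, so this send completes even though
	// nobody is receiving yet.
	buffered := make(chan int, 1)
	buffered <- 42
	fmt.Println(<-buffered) // prints 42

	// Unbuffered: a send blocks until a receiver arrives, so the
	// send must happen in another goroutine or we would deadlock.
	unbuffered := make(chan int)
	go func() { unbuffered <- 7 }()
	fmt.Println(<-unbuffered) // prints 7
}
```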
The study of the usage patterns of these concurrency primitives in various code bases, along with the occurrence of bugs in those code bases, concluded that even though message passing was used in fewer places, it accounted for a larger share of the bugs (58%).
Implication 1: With heavier usages of goroutines and new types of concurrency primitives, Go programs may potentially introduce more concurrency bugs
Also interesting to note is this observation in the paper:
Observation 5: All blocking bugs caused by message passing are related to Go’s new message passing semantics like channel. They can be difficult to detect especially when message passing operations are used together with other synchronization mechanisms
The authors also talk about various ways in which the Go runtime can detect some of these concurrency bugs. The Go runtime includes a deadlock detector, which fires when no goroutine in the program can make progress, although it cannot detect all the blocking bugs that the authors found by manual inspection.
For shared memory bugs, Go also includes a data race detector, which can be enabled by adding the -race option when building the program. It can find races in memory/data shared between multiple threads of execution, and uses the happens-before algorithm underneath to track objects and their lifecycle. Although it can only detect a part of the bugs discovered by the authors, the patterns and classification in the paper can be leveraged to improve the detection and build more sophisticated checkers.
TL;DR: Trying to learn new things, I tried writing a URL shortener called shorty. This is a first draft, and I am trying to approach it on a first-principles basis, breaking everything down to the simplest components.
I decided to write my own URL shortener, and the reason for doing that was to dive a little deeper into golang and to learn more about systems. I plan to not only document my learning but also find and point out different ways in which this application can be made scalable, resilient and robust.
The high-level idea is to write a server which takes a big URL and returns a short URL for it. I have one more requirement: I want to be able to provide a slug, i.e. a custom short URL path. So for a link like https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed, I want a URL like url.farhaan.me/linktray which is easy to remember and distribute.
The way I am thinking of implementing this is with two components: a CLI client and the server it talks to. I don’t want a fancy UI for now, because I want it to be used exclusively through the terminal. It is a client-server architecture, where my CLI client sends a request to the server with a URL and an optional slug. If a slug is present, the short URL will contain it; if it isn’t, the server generates a random string to make the URL short. Seen from a higher level, it’s not just a URL shortener but also a URL tagger.
The way a simple URL shortener works:
Flow Diagram
A client makes a request to shorten a given URL; the server stores the URL in the database, generates a random string, maps the URL to that string, and returns a URL like url.farhaan.me/<randomstring>.
Now when a client requests url.farhaan.me/<randomstring>, it goes to the same server, which looks up the original URL and redirects the request to the original website.
The slug implementation is very straightforward: given a word, we search the database; if it is already present we raise an error, but if it isn’t we add it to the database and return the URL.
One optimization: since it’s just me who is going to use this, I can check whether the long URL already exists in the database, and if it does, avoid creating a new entry. But this should only happen for random strings, not for slugs. This is a trade-off between reducing redundancy and the latency of a request.
But when it comes to generating a random string, things get a tiny bit complicated. The way these strings are generated decides how many URLs you can store. There are various approaches I can use to produce a string, such as hashing with md5, or encoding a counter in base10 or base64. I also need to make sure that the result is unique, not repeated.
A unique string can be maintained using a counter; the count can either be supplied by a separate service, which helps scale the system better, or be generated internally - I have used the database record number for this.
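Here is a sketch of that counter-based approach in Go (my own illustration, not the actual shorty code): encode the database record number in base62, so distinct record numbers always produce distinct short strings and no collision check is needed.

```go
package main

import (
	"fmt"
	"strings"
)

const alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// encode turns a unique counter (e.g. a DB record number) into a short string.
// Because the mapping is a bijection, uniqueness of the counter guarantees
// uniqueness of the short string.
func encode(n uint64) string {
	if n == 0 {
		return string(alphabet[0])
	}
	var sb strings.Builder
	for n > 0 {
		sb.WriteByte(alphabet[n%62])
		n /= 62
	}
	// digits were emitted least-significant first; reverse for readability
	b := []byte(sb.String())
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

func main() {
	for _, id := range []uint64{0, 1, 61, 62, 12345} {
		fmt.Printf("record %d -> url.farhaan.me/%s\n", id, encode(id))
	}
}
```

With 62 characters, even a 6-character string covers about 56 billion URLs, which is plenty for a personal shortener.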
If you look at this from a system design angle, we are using the same server to take the request and generate the URL, and to redirect requests. This can be separated into two services, where one service generates the URL and the other just redirects. This way we increase the availability of the system: if one of the services goes down, the other will still function.
The next step is to write and integrate a CLI to talk to the server and fetch URLs - a client for the end user. I am also planning to integrate a caching mechanism, not something off the shelf, but a simple caching system with a cache eviction policy that I write myself.
Till then I will be waiting for the feedback. Happy Hacking.
I now have a Patreon open, so that you folks can support me in doing this for a longer time and help me sustain myself too. Feel free to subscribe and help me keep doing this, with added benefits.
TL;DR: Link Tray is a utility we recently wrote to curate links from different places and share them with your friends. This blog post has the technical details and probably some productivity tips.
Link Bubble got my total attention when I learnt about it. I felt it was a very novel idea; it helps you save time and curate the websites you visited. So, on the whole - and believe me, I am downplaying it when I say this - Link Bubble does two things:
Saves time by pre-opening the pages
Helps you to keep a track of pages you want to visit
It’s a better tab management system; what felt weird to me was building a whole browser to do that. Obviously, I am being extremely naive when I say this, because I don’t know what it takes to build a utility like that.
Now, since they discontinued it a while back, I never got a chance to use it. So I thought, let me try building something very similar, though my use case was totally different. Generally, when I go through blogs or articles, I open the links mentioned in them in different tabs to come back to later. This has bitten me many times, because I just get lost among so many links.
I thought if there is a utility which could just capture the links on the fly and then I could quickly go through them looking at the title, it might ease out my job. I bounced off the same idea across to Abhishek and we ended up prototyping LinkTray.
Our first design was highly inspired by Facebook Messenger, but instead of chat heads we had opened links. If you think about it, the idea feels very beautiful, but the design is “highly” not scalable. For example, with as many as 10 links opened, we had trouble finding the links we were interested in - a beautiful design problem we faced.
We quickly went to the whiteboard and put up a list of requirements, from first principles. The ask was simple:
To share multiple links with multiple people with least transitions
To be able to see what you are sharing
To be able to curate links (add/remove/open links)
We took inspiration from an actual drawer, from which we flick out a bunch of links and go through them. In a serendipitous moment the design came to us, and that’s how Link Tray looks the way it does now.
Link Tray
Link Tray was a technical challenge as well. There is a plethora of things I learnt about the Android ecosystem and application development that I knew existed but never ventured into exploring it.
Link Tray is written in Java, and I was using a very loosely maintained library to get the overlay activity to work. Yes, the floating activity or application that we see is called an overlay activity; it allows the application to be opened over an already running application.
The library that I was using didn’t have support for Android O and above. It took me a few nights to figure that out - also because I was hacking on the project during nights. After reading a lot of GitHub issues, I figured out the problem and added support for the required operating system versions.
One of the really exciting features of Android that I explored is Services. I think I might have read most of the blogs out there and all the documentation available, and I know that I still don’t know enough, but I was able to pick up enough pointers to make my utility work.
Just like Uncle Bob says: make it work, then make it better. There was a persistent problem: the service needs to keep running in the background for the app to work. This was not a functional issue, but it certainly was a performance issue, and users of version 1.0 did have a problem with it. People got misled by the constant notification that Link Tray was running, and it was annoying. This looked like a simple problem on the surface but was a monster in the depths.
Architecture of Link Tray
The solution to the problem was simple: stop the service when the tray is closed, and start it again when a link is shared back to Link Tray. I tried it; the service did stop, but when a new link was shared, the application kept crashing. Later I figured out that the bound service started by the library I am using sets a bound flag to true, but when they try to reset this flag, they do it in the wrong place - which prompted me to write this StackOverflow answer to help people understand the lifecycle of a service. Finally, after a lot of logs and debugging sessions, I found the issue and fixed it. It was one of the most exciting moments, and it helped me learn a lot of key concepts.
The other key thing I learnt while developing Link Tray was about multithreading. When a link is shared to Link Tray, we need the title of the page, if it has one, and the favicon of the website. Initially I was doing this on the main UI thread, which is not only an anti-pattern but also a usability hazard: the network call blocked the application until it completed. I learnt how to make a network call on a different thread and keep the application smooth.
The initial approach was to get a WebView to work; we were literally opening the links in a browser and extracting the title and favicon from there, which was a very heavy process, because we were literally spawning a browser to get information about links. In the initial design it made sense, because we were giving users an option to consume the links. Over time our design improved, and we came to a point where we offer curation rather than consumption. Hence we opted for web scraping; I used custom headers so that we don’t get blocked as robots. After so much effort, it got to a place where it is stable and performing great.
It did take quite some time to reach the point where it is right now, but it is fully functional and stable. Do give it a go if you haven’t, and you can shoot any queries my way.
So, recently I started using Windows for work. Why? There are a couple of reasons: one, I needed to use MSVC, the Microsoft Visual C++ toolchain; the other being that I wasn’t quite comfortable ifdef-ing stuff to also make it work with GCC, the GNU counterpart of MSVC.
After an anxious month, I am writing a Krita Weekly again, and this will probably be my last one too, though I hope not. Let’s start by talking about bugs. Unlike the trend of the last couple of months, the numbers have taken a serious dip.
[Published in Open Source For You (OSFY) magazine, October 2017 edition.]
This article is the eighth in the DevOps series. In this issue, we shall learn to set up Docker in the host system and use it with Ansible.
Introduction
Docker provides operating system level virtualisation in the form of containers. These containers allow you to run standalone applications in an isolated environment. The three important features of Docker containers are isolation, portability and repeatability. All along we have used Parabola GNU/Linux-libre as the host system, and executed Ansible scripts on target Virtual Machines (VM) such as CentOS and Ubuntu.
Docker containers are extremely lightweight and fast to launch. You can also specify the amount of resources that you need, such as CPU, memory and network. The Docker technology was launched in 2013, and released under the Apache 2.0 license. It is implemented in the Go programming language. A number of frameworks have been built on top of Docker for managing clusters of servers. The Apache Mesos project, Google’s Kubernetes, and the Docker Swarm project are popular examples. These are ideal for running stateless applications and help you to easily scale them horizontally.
Setup
The Ansible version used on the host system (Parabola GNU/Linux-libre x86_64) is 2.3.0.0. Internet access should be available on the host system. The ansible/ folder contains the following file:
ansible/playbooks/configuration/docker.yml
Installation
The following playbook is used to install Docker on the host system:
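The playbook listing itself is missing here, so the following is a hedged reconstruction based on the description that follows it; the task names, tags, and exact module options are my assumptions for the Parabola/pacman setup:

```yaml
---
- name: Install Docker on the host system
  hosts: localhost
  become: true
  tags: [docker]

  tasks:
    - name: Update the Parabola package repository
      pacman:
        update_cache: yes

    - name: Install docker and the Python bindings Ansible needs
      pacman:
        name: "{{ item }}"
        state: present
      with_items:
        - docker
        - python2-docker

    - name: Start the Docker daemon service
      service:
        name: docker
        state: started

    - name: Fetch and run the hello-world container
      docker_container:
        name: hello-world
        image: library/hello-world
        state: started
```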
The Parabola package repository is updated before proceeding to install the dependencies. The python2-docker package is required for use with Ansible. Hence, it is installed along with the docker package. The Docker daemon service is then started and the library/hello-world container is fetched and executed. A sample invocation and execution of the above playbook is shown below:
With verbose ’-v’ option to ansible-playbook, you will see an entry for LogPath, such as /var/lib/docker/containers//-json.log. In this log file you will see the output of the execution of the hello-world container. This output is the same when you run the container manually as shown below:
$ sudo docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://cloud.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/
Example
A Deep Learning (DL) Docker project is available (https://github.com/floydhub/dl-docker) with support for frameworks, libraries and software tools. We can use Ansible to build the entire DL container from the source code of the tools. The base OS of the container is Ubuntu 14.04, and will include the following software packages:
Tensorflow
Caffe
Theano
Keras
Lasagne
Torch
iPython/Jupyter Notebook
Numpy
SciPy
Pandas
Scikit Learn
Matplotlib
OpenCV
The playbook to build the DL Docker image is given below:
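The playbook listing is not reproduced here; the sketch below reconstructs it from the description that follows (the checkout path and task names are my assumptions):

```yaml
---
- name: Build the Deep Learning Docker image
  hosts: localhost
  become: true
  tags: [image]

  tasks:
    - name: Clone the Deep Learning docker project sources
      git:
        repo: https://github.com/floydhub/dl-docker.git
        dest: /tmp/dl-docker

    - name: Build the image for the CPU using Dockerfile.cpu
      docker_image:
        name: floydhub/dl-docker
        tag: cpu
        path: /tmp/dl-docker
        dockerfile: Dockerfile.cpu
        state: present
```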
We first clone the Deep Learning docker project sources. The docker_image module in Ansible helps us to build, load and pull images. We then use the Dockerfile.cpu file to build a Docker image targeting the CPU. If you have a GPU in your system, you can use the Dockerfile.gpu file. The above playbook can be invoked using the following command:
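The exact command is not shown above; given the file layout listed earlier and the playbook's tag, it would presumably be something like:

```shell
$ ansible-playbook playbooks/configuration/docker.yml --tags image -K
```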
Depending on the CPU and RAM you have, it will take a considerable amount of time to build the image with all the software. So be patient!
Jupyter Notebook
The built dl-docker image contains Jupyter notebook which can be launched when you start the container. An Ansible playbook for the same is provided below:
- name: Start Jupyter notebook
hosts: localhost
gather_facts: true
become: true
tags: [notebook]
vars:
DL_DOCKER_NAME: "floydhub/dl-docker"
tasks:
- name: Run container for Jupyter notebook
docker_container:
name: "dl-docker-notebook"
image: "{{ DL_DOCKER_NAME }}:cpu"
state: started
command: sh run_jupyter.sh
You can invoke the playbook using the following command:
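The command itself is missing above; given the notebook tag in the playbook, it would presumably be something like:

```shell
$ ansible-playbook playbooks/configuration/docker.yml --tags notebook -K
```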
The Dockerfile already exposes the port 8888, and hence you do not need to specify the same in the above docker_container configuration. After you run the playbook, using the ‘docker ps’ command on the host system, you can obtain the container ID as indicated below:
$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a876ad5af751 floydhub/dl-docker:cpu "sh run_jupyter.sh" 11 minutes ago Up 4 minutes 6006/tcp, 8888/tcp dl-docker-notebook
You can now login to the running container using the following command:
$ sudo docker exec -it a876 /bin/bash
You can then run an ‘ifconfig’ command to find the local IP address (“172.17.0.2” in this case), and then open http://172.17.0.2:8888 in a browser on your host system to see the Jupyter Notebook. A screenshot is shown in Figure 1:
TensorBoard
TensorBoard consists of a suite of visualization tools to understand the TensorFlow programs. It is installed and available inside the Docker container. After you login to the Docker container, at the root prompt, you can start Tensorboard by passing it a log directory as shown below:
# tensorboard --logdir=./log
You can then open http://172.17.0.2:6006/ in a browser on your host system to see the Tensorboard dashboard as shown in Figure 2:
Docker Image Facts
The docker_image_facts Ansible module provides useful information about a Docker image. We can use it to obtain the image facts for our dl-docker container as shown below:
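The playbook is not reproduced here; a minimal reconstruction (task names and tag are my assumptions) could be:

```yaml
---
- name: Get Docker image facts
  hosts: localhost
  become: true
  tags: [facts]

  tasks:
    - name: Obtain facts for the dl-docker image
      docker_image_facts:
        name: floydhub/dl-docker:cpu
```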
The ANSIBLE_STDOUT_CALLBACK environment variable is set to ‘json’ to produce a JSON output for readability. Some important image facts from the invocation of the above playbook are shown below:
[NOTE: This post originally appeared on
deepsource.io, and has been
posted here with due permission.]
In the early part of the last century, when David Hilbert was working on
stricter formalization of geometry than Euclid, Georg Cantor had worked out a
theory of different types of infinities, the theory of sets. This
theory would soon unveil a series of confusing paradoxes, leading to
a crisis in the Mathematics community regarding the stability of the
foundational principles of the math of that time.
Central to these paradoxes was Russell’s paradox (or, more generally, as
we’ll talk about later, the Epimenides paradox). Let’s see what it is.
In those simpler times, you were allowed to define a set if you could describe
it in English. And, owing to mathematicians’ predilection for self-reference,
sets could contain other sets.
Russell then, came up with this:
\(R\) is a set of all the sets which do not contain themselves.
The question was "Does \(R \) contain itself?" If it doesn’t, then according to
the second half of the definition it should. But if it does, then it no longer
meets the definition.
The same can symbolically be represented as:
Let \(R = \{ x \mid x \not \in x \} \), then \(R \in R \iff R \not \in R \)
Cue mind exploding.
“Grelling’s paradox” is a startling variant which uses adjectives instead of
sets. If adjectives are divided into two classes, autological
(self-descriptive) and heterological (non-self-descriptive), then, is
‘heterological’ heterological? Try it!
Epimenides Paradox
Or, the so-called Liar Paradox, was another such paradox, one which shredded
whatever concept of ‘computability’ existed at that time - the notion that
things could either be true or false.
Epimenides was a Cretan, who made one immortal statement:
“All Cretans are liars.”
If all Cretans are liars, and Epimenides was a Cretan, then he was lying when
he said that “All Cretans are liars”. But wait, if he was lying then, how can
we ‘prove’ that he wasn’t lying about lying? Ein?
This is what makes it a paradox: a statement that rudely violates the assumed
dichotomy of statements into true and false, because if you tentatively think
it’s true, it backfires on you and makes you think that it is false. And a
similar backfire occurs if you assume that the statement is false. Go ahead,
try it!
If you look closely, there is one common culprit in all of these paradoxes,
namely ‘self-reference’. Let’s look at it more closely.
Strange Loopiness
If self-reference - or what Douglas Hofstadter, whose prolific work on the
subject has inspired this blog post, calls ‘Strange Loopiness’ - was the
source of all these paradoxes, it made perfect sense to just banish
self-reference, or anything which allowed it to occur. Russell and Whitehead,
two rebel mathematicians of the time who subscribed to this point of view,
set forward and undertook the mammoth exercise known as “Principia
Mathematica”, which, as we will see in a little while, was utterly demolished
by Gödel’s findings.
The main thing which made it difficult to ban self-reference was that it was
hard to pinpoint where exactly the self-reference occurred. It may as well be
spread out over several steps, as in this ‘expanded’ version of Epimenides:
The next statement is a lie.
The previous statement is true.
Russell and Whitehead, in P.M., then came up with a multi-hierarchy set
theory to deal with this. The basic idea was that a set of the lowest ‘type’
could only contain ‘objects’ as members (not sets). A set of the next type up
could then only contain objects, or sets of lower types. This implicitly
banished self-reference.
Since all sets must have a type, a set ‘which contains all sets which are not
members of themselves’ is not a set at all, and thus you can say that Russell’s
paradox was dealt with.
Similarly, if an attempt is made to apply the expanded Epimenides to this
theory, it must fail as well: for the first sentence to make a reference to
the second one, it has to be hierarchically above it - in which case, the
second one can’t loop back to the first one.
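The type restriction can be sketched in code. Here is a small Python model — my own illustration, not anything from P.M. itself — where the constructor enforces that members are of strictly lower type, so a set that contains itself (or a peer) simply cannot be built:

```python
class TypedSet:
    """Toy model of P.M.'s hierarchy: a set of type `level` may contain
    only plain objects (treated as type 0) or sets of strictly lower type."""

    def __init__(self, level, members=()):
        for m in members:
            # non-TypedSet members count as bare 'objects' of type 0
            m_level = m.level if isinstance(m, TypedSet) else 0
            if m_level >= level:
                raise TypeError("members must be of strictly lower type")
        self.level = level
        self.members = list(members)

fruits = TypedSet(1, ["apple", "banana"])  # objects only: fine
crates = TypedSet(2, [fruits, "label"])    # a type-1 set inside type-2: fine
# TypedSet(1, [fruits])  -> TypeError: no set can contain a peer, and a set
# can never be passed to its own constructor, so Russell's set can't exist
```

Self-containment is ruled out by construction: by the time a set exists and could be mentioned, its type is already fixed below any set that might contain it.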
Thirty-one years after David Hilbert challenged academia to rigorously
demonstrate that the system defined in Principia Mathematica was both
consistent (contradiction-free) and complete (i.e. every true statement
could be proved within the methods provided by P.M.), Gödel published his
famous Incompleteness Theorem. By importing the Epimenides Paradox right
into the heart of P.M., he proved that not just the axiomatic system
developed by Russell and Whitehead, but any axiomatic system whatsoever
(at least, any powerful enough to express basic arithmetic) cannot be
complete without being inconsistent.
Clearly enough, P.M. lost its charm in the realm of academics.
Even before Gödel’s work, though, P.M. wasn’t particularly loved.
Why?
We humans, in general - and not just within this blog post - have an appetite
for self-reference, and this quirky theory severely limits our ability to
abstract away details - something we love not only as programmers, but as
linguists too. So much so that the preceding sentence, “We humans … this blog
post …”, would be doubly forbidden: the ‘right’ to mention ‘this blog post’
is limited only to something hierarchically above blog posts,
‘metablog-posts’; and secondly, I (presumably a human), belonging to the
class ‘we’, can’t mention ‘we’ either.
Since we humans love self-reference so much, let’s discuss some ways in which
it can be expressed in written form.
One way of making such a strange loop, and perhaps the ‘simplest’, is using
the word ‘this’. Here:
This sentence is made up of eight words.
This sentence refers to itself, and is therefore useless.
This blog post is so good.
This sentence conveys to you the meaning of ‘this’.
This sentence is a lie. (Epimenides Paradox)
Another amusing trick for creating a self-reference without using the phrase
‘this sentence’ is to quote the sentence inside itself.
Someone may come up with:
The sentence ‘The sentence contains five words’ contains five words.
But such an attempt must fail, for quoting a finite sentence inside itself
would mean that the sentence is smaller than itself. Infinite sentences,
however, can be self-referenced this way.
The sentence
"The sentence
"The sentence
...etc
...etc
is infinitely long"
is infinitely long"
is infinitely long"
There’s a third method as well, which you already saw in the title - the Quine
method. The term ‘quine’ was coined by Douglas Hofstadter in his book “Gödel,
Escher, Bach” (which heavily inspires this blog post), after the philosopher
Willard Van Orman Quine. When using this method, the self-reference is
‘generated’ by describing a typographical entity isomorphic to the quine
sentence itself. This description is carried out in two parts - one is a set
of ‘instructions’ about how to ‘build’ the sentence, and the other, the
‘template’, contains the construction materials required.
The Quine version of Epimenides would be:
“yields falsehood when preceded by its quotation” yields falsehood when preceded by its quotation
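The mechanics of “preceded by its quotation” are simple enough to spell out in code. This little helper is my own sketch: given only the template, it manufactures the full self-referential sentence — instructions and construction materials in one place:

```python
def quine_phrase(template):
    # quotation of the template first, then the template itself
    return '"' + template + '" ' + template

print(quine_phrase("yields falsehood when preceded by its quotation"))
# -> "yields falsehood when preceded by its quotation" yields falsehood when preceded by its quotation
```

Note that the function never needs to see the finished sentence; the template alone carries enough material to rebuild the whole, which is exactly what makes the Quine trick work without infinite nesting.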
Before going on with ‘quining’, let’s take a moment to realize how awfully
powerful our cognitive capacities are, and what goes on in our heads when a
cognitive payload full of self-references is delivered. In order to decipher
it, we not only need to know the language, but also need to work out the
referent of the phrase analogous to ‘this sentence’ in that language. This
parsing depends on our complex, yet totally assimilated, ability to handle
the language.
The idea of referring to itself is quite mind-blowing, and we keep doing it
all the time - which is perhaps why it feels so ‘easy’ for us to do. But we
aren’t born that way; we grow that way. You can see this by telling someone
much younger, “This sentence is wrong.” They’d probably be confused - what
sentence is wrong? The reason it’s so simple for self-reference to occur, and
hence for paradoxes to arise, in our language is, well, our language: it
allows our brain to do the heavy lifting of working out what the author is
trying to get across to us, without being verbose.
Back to Quines.
Reproducing itself
Now that we are aware of how ‘quines’ can manifest as self-reference, it would
be interesting to see how the same technique can be used by a computer program
to ‘reproduce’ itself.
To make it further interesting, we shall choose the language most apt for the
purpose - brainfuck:
Running the program above produces itself as the output. I agree it isn’t the
most descriptive program in the world, so the Python program below is the
nearest we can get to describing what’s happening inside those horrible chains
of +’s and >’s:
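Here is my reconstruction of that Python program, following the line-by-line description that comes next (the original listing may differ in small details; in particular, the trailing `+ ')'` is my addition so that the printed output matches the source byte for byte):

```python
t = chr(34) * 3
def eniuq(x):
    print(x + t + x + t + ')')
eniuq("""t = chr(34) * 3
def eniuq(x):
    print(x + t + x + t + ')')
eniuq(""")
```

Running it prints exactly these seven lines back, so feeding its output to Python again reproduces it forever.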
The first line generates """ on the fly, which marks multiline strings in
Python.
The next two lines define the eniuq function, which prints its argument,
template, twice - once plain, and then surrounded with triple quotes.
The last 4 lines cleverly call this function so that the output of the program
is the source code itself.
Since we are printing in the opposite order to quining, the name of the
function is ‘quine’ reversed: eniuq (a name stolen from Hofstadter, again).
Remember the discussion about how self-reference relies on the processor?
What if ‘quining’ was a built-in feature of the language, providing what we in
programmer lingo call ‘syntactic sugar’?
Let’s assume that an asterisk, *, in the brainfuck interpreter would copy the
instructions before executing them. What, then, would be the output of the
following program?
*
It’d be an asterisk again. You could argue that this is silly and should be
counted as ‘cheating’. But it’s the same as relying on the processor - like
using “this sentence” to refer to this sentence, you rely on your brain to do
the inference for you.
What if eniuq was a built-in keyword in Python? A perfect self-rep would then
be just a call away:
eniuq('eniuq')
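We can fake that keyword today with an ordinary function. The semantics below are my assumption about how such a builtin would behave: print the argument, then the argument again wrapped as its own quoted call:

```python
def eniuq(s):
    # print the name, then the name wrapped as its own quoted call
    print(s + "('" + s + "')")

eniuq('eniuq')  # prints: eniuq('eniuq')
```

Of course, since eniuq has to be defined inside the program here, the def lines themselves are not reproduced - which is exactly why it would need to be a builtin for the one-liner to be a perfect self-rep.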
What if quine was a verb in the English language? We could cut out a lot of
the explicit cognitive processing required for inference. The Epimenides
paradox would then be:
“yields falsehood if quined” yields falsehood if quined
Now that we are talking about self-rep, here’s one last piece of entertainment
for you.
If you take that absurd thing above and plot it on the Cartesian plane over
the region \(0 \le x \le 106, k \le y \le k + 17\), where \(k\) is a
544-digit integer (just hold on with me here), coloring every pixel black for
True and white otherwise, you'd get: