In my case the story continued for around 2 hours. Yesterday I was trying to
implement something from a given SPEC, and tried to match my output (from Rust)
with the output from the Python code written by Dr. Fett.
The problem
I had to get the JSON encoding from an ordered Array (in Python it is a simple list), say ["kushal", "das"], and get the base64urlencode(sha256sum(JSON_ENCODED_STRING)) in Rust using serde_json.
Easy, isn’t?
Let us see what the encoded string looks like:
"[\"6Ij7tM-a5iVPGboS5tmvVA\",\"John\"]"
And the final checksum is otcxXDbO4_xzK0DXL_DO2INIFXIJdO1PqVM-n_gt-3g.
But, the reference implementation in Python has the checksum as fUMdn88aaoyKTHrvZd6AuLmPraGhPJ0zF5r_JhxCVZs.
It took me a few hours to notice the space after comma in the JSON encoded string from Python:
I’m serving as the Enhancements Role Lead for the current Kubernetes 1.25 Release Team.
As a role lead this time, I have a group of five outstanding shadows that I am not only mentoring to become future leads, but I am also learning from them - both “how to teach” and “how to learn”
I haven’t posted in a long time (ups & downs & new roles & responsibilities & sometimes you don’t feel like doing anything at all & it literally takes all the energy even to do the required)
So, just adding that I was also an Enhancements Shadow on the Kubernetes 1.24 Release Team, and my former role lead, Grace Ngyuen, nominated me to be the next role lead at the conclusion of the previous release cycle.
When I look back on my time throughout these three cycles, I’m amazed at how much I’ve learned. It’s been a great experience. 🙂 Not only did I learn, but I also felt recognized.
Currently, We’re at Week 4 of the 1.25 release cycle, and it’s one of the busiest for the Enhancements role (as we’re almost approaching Enhancements Freeze in a week). I would say, we’re doing good so far! 😄
And one more thing before I finish up this small post!
I got to go to my first ever KubeCon event in person!
I had the opportunity to attend the KubeCon EU 2022 event in Valencia, Spain (my first ever international travel as well). I was astonished that so many people knew who I was (anything more than zero was “so many” for me) and that I already belonged to a tiny group of people. It was an incredible feeling.
I’m not a very photo photo person, but sharing some 🙂
No wonder we’re a bit dizzy. We just multiplied our minds by many orders of magnitude. It’s easy to confuse someone else’s memory (or manipulation) with our hard-earned ability to remember things that actually happened to us.
And we’re now realizing that we have the power (and perhaps the obligation) to use shared knowledge to make better, more thoughtful decisions. And to intentionally edit out the manipulations and falsehoods that are designed to spread, not to improve our lives.
When considering data privacy and protections, there is no data more important than personal data, whether that’s medical, financial, or even social. The discussions around access to our data, or even our metadata, becomes about who knows what, and if my personal data is safe. Today’s announcement between Intel, Microsoft, and DARPA, is a program designed around keeping information safe and encrypted, but still using that data to build better models or provide better statistical analysis without disclosing the actual data. It’s called Fully Homomorphic Encryption, but it is so computationally intense that the concept is almost useless in practice.
So whether that means combining hospital medical records over a state, or customizing a personal service using personal metadata gathered on a user’s smartphone, FHE at that scale is no longer a viable solution. Enter the DARPA DPRIVE program.
DARPA: Defense Advanced Research Projects Agency
DPRIVE: Data Protection in Virtual Environments
Intel has announced that as part of the DPRIVE program, it has signed an agreement with DARPA to develop custom IP leading to silicon to enable faster FHE in the cloud, specifically with Microsoft on both Azure and JEDI cloud, initially with the US government. As part of this multi-year project, expertise from Intel Labs, Intel’s Design Engineering, and Intel’s Data Platforms Group will come together to create a dedicated ASIC to reduce the computational overhead of FHE over existing CPU-based methods. The press release states that the target is to reduce processing time by five orders of magnitude from current methods, reducing compute times from days to minutes.
Excuse me a moment—I am going to be bombastic, over excited, and possibly annoying. The race is run, and we have a winner in the future of quantum computing. IBM, Google, and everyone else can turn in their quantum computing cards and take up knitting.
One key to quantum computing (or any computation, really) is the ability to change a qubit’s state depending on the state of another qubit. This turned out to be doable but cumbersome in optical quantum computing. Typically, a two- (or more) qubit operation is a nonlinear operation, and optical nonlinear processes are very inefficient. Linear two-qubit operations are possible, but they are probabilistic, so you need to repeat your calculation many times to be sure you know which answer is correct. A second critical feature is programmability. It is not desirable to have to create a new computer for every computation you wish to perform. Here, optical quantum computers really seemed to fall down. An optical quantum computer could be easy to set up and measure, or it could be programmable—but not both.
So, what has changed to suddenly make optical quantum computers viable? One is the appearance of detectors that can resolve the number of photons they receive. A second key development was integrated optical circuits. performance has gotten much, much better. Integrated optics are now commonly used in the telecommunications industry, with the scale and reliability that that implies.
The researchers, from a startup called Xanadu and the National Institute of Standards, have pulled together these technology developments to produce a single integrated optical chip that generates eight qubits. The internal setting of the interferometer is the knob that the programmer uses to control the computation. In practice, the knob just changes the temperature of individual waveguide segments. But the programmer doesn’t have to worry about these details. Instead, they have an application programming interface (Strawberry Fields Python Library) that takes very normal-looking Python code. This code is then translated by a control system that maintains the correct temperature differentials on the chip.
What is more, the scaling does not present huge amounts of increased complexity. In superconducting qubits, each qubit is a current loop in a magnetic field. Each qubit generates a field that talks to all the other qubits all the time. Engineers have to take a great deal of trouble to decouple and couple qubits from each other at the right moment. The larger the system, the trickier that task becomes. Ion qubit computers face an analogous problem in their trap modes. There isn’t really an analogous problem in optical systems, and that is their key advantage.
Two exhaustive articles on the historical significance of the Arpanet and how the protocol worked
This is what was totally new about the ARPANET. The ICCC demonstration didn’t just involve a human communicating with a distant computer. It wasn’t just a demonstration of remote I/O. It was a demonstration of software remotely communicating with other software, something nobody had seen before.
So what I’m trying to drive home here is that there is an important distinction between statement A, “the ARPANET connected people in different locations via computers for the first time,” and statement B, “the ARPANET connected computer systems to each other for the first time.” That might seem like splitting hairs, but statement A elides some illuminating history in a way that statement B does not.
In a section with the belabored title, “Technical Aspects of the Effort Which Were Successful and Aspects of the Effort Which Did Not Materialize as Originally Envisaged,” the authors wrote:
Possibly the most difficult task undertaken in the development of the ARPANET was the attempt—which proved successful—to make a number of independent host computer systems of varying manufacture, and varying operating systems within a single manufactured type, communicate with each other despite their diverse characteristics.
There you have it from no less a source than the federal government of the United States.
Long, random passwords just aren’t convenient. If you need to enter 45 randomly-generated characters on another device often enough, you’ll inevitably change that password to something like password123 because it’s easy to type and remember. It’s also - you got it - not strong.
While a lengthy, unintelligible password may appear stronger than a smart one, it’s mainly illusion. Pronounceable syllables make a smart password look human generated and, therefore, weaker. But a human-generated password could never be chosen uniformly and, therefore, can’t be accurately assessed for entropy.
We’ve made a compromise of sorts. We’ve sacrificed a few bits of (theoretical) entropy, that don’t affect real-world security, to gain a whole lot of convenience, compatibility, and accessibility — and those certainly are real world, which is what really matters.
We realized that CI is more sensitive than most users for most of the site. So we focused in on testing the highest impact code. What’s high-impact? 1) the code that fails most visibly and 2) the code that’s hardest to retry. You can build an inventory of high-impact code in under a week by looking at traffic stats, batch job schedules, and asking your support staff.
…
We realized that CI is more sensitive than most users for most of the site. So we focused in on testing the highest impact code. What’s high-impact? 1) the code that fails most visibly and 2) the code that’s hardest to retry. You can build an inventory of high-impact code in under a week by looking at traffic stats, batch job schedules, and asking your support staff.
And it really is important to develop close ties with your support team. Embedded in our strategy above was that CI is much more sensitive than a real user. While perfection is tempting, it’s not unrealistic to ask a bit of patience from an enterprise user, provided your support team is prepared. Sync with them weekly so surprise is minimized. If they’re feeling ambitious, you can teach them some Sentry basics, too.
My main takeaways
Everyone has a plan ’till they get punched in the mouth, — Mike Tyson
Prioritise work on what actually matters. Perfection can wait.
People and their feedback comes first. Matters a lot more than data driven decisions. After all, software is used by by and for people, in the first place.
You’re adapting my what?
When activated, adaptive cruise control uses forward-looking radar to maintain a specific distance to a vehicle in the lane ahead, slowing down or speeding up (to a maximum of whatever speed cruise control was set to) as necessary. Lane-keeping systems use forward-looking cameras to detect the lane markings on a road to keep the vehicle between them, and when both are active together, the vehicle will do a pretty good facsimile of driving itself, albeit with extremely limited situational awareness.
Which is where the human comes in. Under the SAE’s definitions for automated driving, in Level 2 the car controls braking, acceleration, and deceleration, but the human is responsible for providing situational awareness at all times.
Of course, this raises the question of whether the driver is actually paying attention.
To test whether drivers were actually paying attention while using a Level 2 system, IIHS recruited participants and then had them drive for roughly an hour, either using the car’s Level 2 system or not. At three predetermined locations on the test route, a second car—the one with the large pink bear attached to its trunk—would overtake the participant’s vehicle. At the end of the study, the drivers were asked if they saw anything odd, and if so, how many times.
In the second image, there are three ‹N›’s. Yet, they all look exactly the same. A real typewriter can, quite rarely, have one of its letters damaged, or misaligned, such that that letter regularly makes an inferior strike to all the other letters. However, this degree of regularity is impossible; could Underwood or Remington have acheived it, they would have leapt for joy.
While working on the project, incredibly, another bad typewriter scene intruded upon my life. I don’t often sit around and watch movies, so I suppose there are only two possibilities:
a. There are so many of these unrealistic typewritten documents in late-2010’s cinema that almost any movie with a typewritten document in it will be hopelessly unrealistic, or
b. The universe, nay, God himself, was urging me on to complete this project in lieu of others I could finish!
In general, we don’t currently have the technology to image exoplanets unless they’re very large, very young, and a considerable distance from the star they orbit. Yet we can still get some sense of what’s in their atmosphere. To do that, we need to observe a planet that transits across the line of sight between Earth and its star. During a transit, a small percentage of the star’s light will travel through the planet’s atmosphere on its way to Earth, interacting with the molecules present there.
Those molecules leave a signature on the spectrum of light that reaches the Earth. It’s an extremely faint signature, since most of the star’s light never even sees the atmosphere. But by combining the data from a number of days of observation, it’s possible to get this signature to stand out from the noise.
That’s what scientists have done with GJ 1132 b, an exoplanet that orbits a small star about 40 light years from Earth. The planet is roughly Earth’s size and about 1.5 times its mass. It also orbits extremely close to its host star, completing a full orbit in only 1.6 days. That’s close enough to ensure that, despite the small, dim star, GJ 1132 b is extremely hot.
It’s so close and hot, in fact, that the researchers estimate that it’s currently losing about 10,000 kilograms of atmosphere every second. As the host star was expected to be brighter early in its history, the researchers estimate that GJ 1132 b would have lost a reasonable-sized atmosphere within the first 100 million years of its existence. In fact, over the life of the planet, the researchers estimate that it could have lost an atmosphere weighing in at about five times the planet’s current mass—the sort of thing you might see if the remaining planet were the core of a mini-Neptune.
So, researchers were probably surprise to find that, based on data from the Hubble, the planet seems to have an atmosphere. How’d that get here?
It was Patrick Collison, Stripe’s CEO, who pointed out to me that one of the animating principles of early 20th-century Progressivism was guaranteeing freedom of expression from corporations:
Exactly the same kind of restraints upon freedom of thought are bound to occur in every country where economic organization has been carried to the point of practical monopoly. Therefore the safeguarding of liberty in the world which is growing up is far more difficult than it was in the nineteenth century, when free competition was still a reality. Whoever cares about the freedom of the mind must face this situation fully and frankly, realizing the inapplicability of methods which answered well enough while industrialism was in its infancy.
This is why I take Smith’s comments as more of a warning: a commitment to consistency may lead to the lowest common denominator outcome Prince fears, where U.S. social media companies overreach on content, even as competition is squeezed out at the infrastructure level by policies guided by non-U.S. countries. It’s a bad mix, and public clouds in particular would be better off preparing for geographically-distinct policies in the long run, even as they deliver on their commitment to predictability and process in the meantime, with a strong bias towards being hands-off. That will mean some difficult decisions, which is why it’s better to make a commitment to neutrality and due process now.
Excel may be the most influential software ever built. It is a canonical example of Steve Job’s bicycle of the mind, endowing its users with computational superpowers normally reserved for professional software engineers. Armed with those superpowers, users can create fully functional software programs in the form of a humble spreadsheet to solve problems in a seemingly limitless number of domains. These programs often serve as high-fidelity prototypes of domain specific applications just begging to be brought to market in a more polished form.
If you want to see the future of B2B software, look at what Excel users are hacking together in spreadsheets today.
The disconnect occurs when producers and creators try to average things out and dumb things down, hoping for the big hit that won’t come. Or overspend to get there. The opportunity lies in finding a viable audience and matching the project’s focus and budget to the people who truly want it.
It clarifies your thinking.
It’s a project that is completely and totally up to you.
Because it’s a generous way to share.
It will increase your authority in your field.
Some developers I’ve known seem to think that being good at concurrency makes them badass. Others seem to think that senior developers must be great at concurrency, and so they should be too.
But what senior developers are good it is eliminating concurrency as much as possible by developing a simple, easy, consistent model to follow for the app and its components.
And this is because concurrency is too difficult for humans to understand and maintain. Maybe you can create a system that makes extensive use of it, and have it be correct for one day. But think of your team! Even if you’re a solo developer, you and you-plus-six-months makes you a team.
I know you’re worried about blocking the main thread. But consider this: it’s way easier to fix a main-thread-blocker than it is to fix a weird, intermittent bug or crash due to threading.
Experts don’t prove themselves by using every piece of the spec; they prove themselves by knowing the spec well enough to deploy syntax judiciously and make well-reasoned decisions. This is how experts become multipliers—how they make new experts.
So what does this mean for those of us who consider ourselves experts or aspiring experts? It means that writing code involves asking yourself a lot of questions. It means considering your developer audience in a real way. The best code you can write is code that accomplishes something complex, but is inherently understood by those who examine your codebase.
And no, it’s not easy. And there often isn’t a clear-cut answer. But it’s something you should consider with every function you write.
Len Kleinrock: The First Two Packets on the Internet
First, I was at Digital (DEC) for 15 years, right! Now that was a different career because I was in the mid-range group where we built computers out of ECL - these were refrigerator-sized boxes. I was in the DEC Alpha team where we built little microprocessors, little teeny things, which at the time we thought were huge. These were 300 square millimeters at 50 watts, which blew everybody’s mind.
So I was there for a while, and I went to AMD right during the internet rush, and we did a whole bunch of stuff in a couple of years. We started Opteron, HyperTransport, 2P servers - it was kind of a whirlwind of a place. But I got sucked up or caught up in the enthusiasm of the internet, and I went to SiByte, which got bought by Broadcom, and I was there for four years total. We delivered several generations of products.
I was then at P.A Semi, and we delivered a great product, but they didn’t really want to sell the product for some reason, or they thought they were going to sell it to Apple. I actually went to Apple, and then Apple bought P.A Semi, and then I worked for that team, so you know I was between P.A Semi and Apple. That was seven years, so I don’t really feel like that was jumping around too much.
Then I jumped to AMD I guess, and that was fun for a while. Then I went to Tesla where we delivered Hardware 3 (Tesla Autopilot). So that was kind of phenomenal. From a standing start to driving a car in 18 months - I don’t think that’s ever been done before, and that product shipped really successfully. They built a million of them last year. Tesla and Intel were a different kind of a whirlwind, so you could say I jumped in and jumped out. I sure had a lot of fun.
But what is Network Time Security (NTS)? To understand NTS, first we have to get familiar with NTP.
NTP, Network Time Protocol is a networking protocol for clock synchronization. In simple terms this protocol helps your system to get correct time. RFC 5905 lays the ground rule for NTP.
Now, NTS is an improvement over NTP. It is made of two different parts.
NTS key exchange: When using TLS the NTP client and server exchanges key materials.
NTS authentication: This part ensures that the time synchronization packets are authenticated using the key materials from the step one.
To read more on the subject I found this blog post from Netnod. While this is an useful resource, but this does not explain how to use NTS on my Fedora laptop, which uses chrony for time sync.
Which servers to use?
I am using servers from 2 providers - Netnod and Cloudflare. Since I am based in Sweden I will be using :
sth1.nts.netnod.se
sth2.nts.netnod.se
Both of above mentioned servers are for the users located near Stockholm.
For everyone who are not close enough to Stockholm they can use
nts.netnod.se
I will further be using the NTS server from Cloudflare, because they have their servers at 180 cities around the globe.
time.cloudflare.com
Update /etc/chrony.conf
I am adding the following configuration to my /etc/chrony.conf. My configuration adds the NTS servers to use and further disables NTP servers received by DHCP.
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (https://www.pool.ntp.org/join.html).
server sth1.nts.netnod.se iburst nts
server sth2.nts.netnod.se iburst nts
server nts.netnod.se iburst nts
server time.cloudflare.com iburst nts
# Use NTP servers from DHCP.
#sourcedir /run/chrony-dhcp
Now restart the chronyd service, sudo systemctl restart chronyd.
We can verify that the system is using NTS with the following command:
Mode : Which authentication mechanism has been used. In our case it should be NTS.
KeyID: it is a number starting at zero and incremented by one with each successful key establishment using the NTS-KE protocol,
Type: Which algorithm is used for authentication. Here 15 means AEAD-AES-SIV-CMAC-256
KLen: Length of the key in bits.
Last: How long ago the last successful key establishment has been taken place. The time is mentioned in seconds. The letters m, h, d or y indicate minutes, hours, days, or years.
Atmp: Number of attempts to perform the key establishment since the last successful key establishment. Any number larger than 1 indicates a problem with the network or server.
NAK: If an NTS NAK was received since the last request.
Cook: Number of NTS cookies chrony currently possesses.
CLen: Length of the last cookie used in bytes.
Ubuntu 22.04 uses systemd-timesyncd, which still does not support NTS. One can follow the discussion here.
dgplug summer training 2022 will start at 13:00 UTC, 25th July. This is our 15th edition. Following our tradition, we have updated and modified our training based on the feedback from the old participants and community at large. There shall be more experiments for the betterment during this year’s training as well.
What are we going to do this (late) summer?
Together we will learn how to become an upstream contributor in the Free and Open Source community. The prime focus for summer training is always communication. We will learn how to efficiently use this basic skill to perform the technical tasks on hand. At the same time we shall dig into the ethical side of the technology. Helping people to learn about security and privacy (both as contributor & as a responsible part of the human race) is at very core of our training. One can read more about the training in general from the FAQ.
Who all can be a part of the training?
Anyone and everyone who wants to be a part of the Free and Open Source community can join our training. We have people from variant educational, age, geographical background in all our previous training editions. So, we expect the same and more diverse people this year as well.
What to expect when you come to the training?
Dgplug Summer Training is not a boot camp. The only promise we do make to attendees that together we will learn how to become an upstream contributor and give back to the community.
Experience of past attendees
“Learn and teach others” is our moto. People who once were attendees are now mentors. Here are some memoir of our past attendees :
Yesterday, I came across a tweet by Sara Soueidan, which resonated with me. Mostly because I have had this discussion (or heated arguments) quite a few times with many folks. Please go and read her tweet thread since she mentions some really great points about why progressive enhancement is not anti-js. As someone who cares about security, privacy, and accessibility, I have always been an advocate of progressive enhancement. I always believe that a website (or any web-based solution) should be accessible even without JavaScript in the browser. And more often than not, people take me as someone who is anti-JavaScript. Well, let me explain with the help (a lot of help) of resources already created by other brilliant folks.
What is Progressive Enhancement?
Progressive enhancement is the idea of making a very simple, baseline foundation for a website that is accessible and usable by all users irrespective of their input/output devices, browsers (or user-agents), or the technology they are using. Then, once you have done that, you sprinkle more fancy animations and custom UI on top that might make it look more beautiful for users with the ideal devices.
If you saw the video by Heydon, I am sure you are starting to get some idea. Here I am going to reference another video titled Visual Styling vs. Semantic Meaning, which was created by Manuel Matuzović. I love how, in this video Manuel shares the idea of building first semantically and then visually styling it.
So I think a good way to do progressive enhancement according to me is:
Start with HTML - This is a very good place to start, because not only does this ensure that almost all browsers and user devices can render this, but also it helps you think semantically instead of based on the visual design. That already starts making your website not only good for different browsers, but also for screen reader and assistive technology users.
Add basic layout CSS progressively - This is the step where you start applying visual designs. But only the basic layouts. This progressively enhances the visual look of the website, and also you can add things like better focus styles, etc. Be careful and check caniuse.com to add CSS features that are well supported across most browsers in different versions. Remember what Heydon said? "A basic Layout is not a broken layout".
Add fancy CSS progressively - Add more recent CSS features for layouting and to progressively enhance the visual styling of your website. Here you can add much more newer features that make the design look even more perfect.
Add fancy JavaScript sparkles progressively - If there are animations, and interactions that you would like the user to have that is not possible by HTML & CSS, then start adding your JavaScript at this stage. JavaScript is often necessary for creating accessible custom UIs. So absolutely use when necessary to progressively enhance the experience of your users based on the user-agents they have.
SEE! I told you to add JavaScript! So no, progressive enhancement is not about being anti-JavaScript. It's about progressively adding JavaScript wherever necessary to enhance the features of the website, without blocking the basic content, layout and interactions for non-JavaScript users.
Well, why should I not write everything in JavaScript?
I know it's trendy these days to learn fancy new JavaScript frameworks and write fancy new interactive websites. So many of you at this point must be like, "Why won't we write everything in JavaScript? Maybe you hate JavaScript, that's why you are talking about these random HTML & CSS things. What are those? Is HTML even a programming language?"
Well firstly, I love JavaScript. I have contributed to many JavaScript projects, including jQuery. So no I don't hate JavaScript. But I love to use JavaScript for what JavaScript is supposed to be used for. And in most cases, layouting or loading basic content isn't one of them.
But who are these people who need websites to work without JavaScript?
People who have devices with only older browsers. Remember, buying a new device isn't so easy in every part of the world and sometimes some devices may have user-agents that don't support fancy JavaScript. But they still have the right to read the content of the website.
People who care about their security and privacy. A lot of security and privacy focused people prefer using a browser like Tor Browser with JavaScript disabled to avoid any kind of malicious JavaScript or JavaScript based tracking. Some users even use extensions like NoScript with common browsers (firefox, chrome, etc.) for similar reasons. But just because they care about their security and privacy doesn't mean they shouldn't have access to wesite content.
People with not so great internet. Many parts of the world still don't have access to great internet and rely on 2G connections. Often loading a huge bundled JavaScript framework with all it's sparkles and features takes unrealistically long time. But they should still be able to access content from a website article.
So, yes. It's not about not using JavaScript. It's more about starting without JavaScript, and then adding your bells and whistles with JavaScript. That way people who don't use JavaScript can still access atleast the basic content.
select_for_update is the answer if you want to acquire a lock on the row. The lock is only released after the transaction is completed. This is similar to the Select for update statement in the SQL query.
>>> Dealership.objects.select_for_update().get(pk='iamid')
>>> # Here lock is only required on Dealership object
>>> Dealership.objects.select_related('oem').select_for_update(of=('self',))
select_for_update have these four arguments with these default value
– nowait=False
– skiplocked=False
– of=()
– nokey=False
Let's see what these all arguments mean
nowait
Think of the scenario where the lock is already acquired by another query, in this case, you want your query to wait or raise an error, This behavior can be controlled by nowait, If nowait=True we will raise the DatabaseError otherwise it will wait for the lock to be released.
skip_locked
As somewhat name implies, it helps to decide whether to consider a locked row in the evaluated query. If the skip_locked=true locked rows will not be considered.
nowait and skip_locked are mutually exclusive using both together will raise ValueError
of
In select_for_update when the query is evaluated, the lock is also acquired on the select related rows as in the query. If one doesn't wish the same, they can use of where they can specify fields to acquire a lock on
>>> Dealership.objects.select_related('oem').select_for_update(of=('self',))
# Just be sure we don't have any nullable relation with OEM
no_key
This helps you to create a weak lock. This means the other query can create new rows which refer to the locked rows (any reference relationship).
Few more important points to keep in mind select_for_update doesn't allow nullable relations, you have to explicitly exclude these nullable conditions. In auto-commit mode, select_for_update fails with error TransactionManagementError you have to add code in a transaction explicitly. I have struggled around these points :).
Here is all about select_for_update which you require to know to use in your code and to do changes to your database.
Anyone who has dealt with <form> tag in HTML might have come across the autocomplete attribute. Most developers just put autocomplete="on" or autocomplete="off" based on whether they want users to be able to autocomplete the form fields or not. But there's much more in the autocomplete attribute than many folks may know.
Browser settings
Most widely used browsers (Firefox, Chrome, Safari, etc.), by default, remember information that is submitted using a form. When the user later tries to fill another form, browsers look at the name or type attribute of the form field, and then offer to autocomplete or autofill based on the saved information from previous form submissions. I am assuming many of you might have experienced these autocompletion suggestions while filling up forms. Some browsers, like Firefox, look at the id attribute and sometimes even the value of the <label> associated with the input field.
Autofill detail tokens
For a long time, the only valid values for the autocomplete attribute were "on" or "off" based on whether the website developer wanted to allow the browser to automatically complete the input. However, in the case of "on", it was left entirely to the browser how they determine which value is expected by the input form. Now, for some time, the autocomplete tag allows for some other values, which are collectively as a group called autofill detail tokens.
These values help tell the browser exactly what the input field expects without needing the browser to guess it. There is a big list of autofill detail tokens. Some of the common ones are "name", "email", "username", "organizations", "country", "cc-number", and so on. Check the WHATWG Standard for autofill detail tokens to understand what are the valid values, and how they are determined.
There are two different autofill detail tokens associated with passwords which have some interesting features apart from the autocompletion:
"new-password" - This is supposed to be used for "new password field"s or for "confirm new password field"s. This helps separate a current password field from a new password field. Most browsers and most password managers, when they see this in autocomplete attribute, will avoid accidentaly filling existing passwords. Some even suggest a new randomly generated password for the field if autocomplete has "new-password" value.
"current-password" - This is used by browsers and password managers to autofill or suggest autocompletion with the current saved password for that email/username for that website.
The above two tokens really help in intentionally separating new password fields from login password fields. Otherwise, the browsers and password managers don't have much to separate between the two different fields and may guess it wrong.
Privacy concerns
Now, all of the above points might already be giving privacy and security nightmares to many of you. Firstly, the above scenario works only if you are on the same computer, using the same accounts, and the same browsers. But there are a few things you can do to avoid autocompletion, or saving of the data when filling up the form.
Use the browser in privacy/incognito mode. Most browsers will not save the form data submitted to a website when opened in incognito mode. They will, however still suggest autocompletion, based on the saved information from normal mode.
If you already have autocomplete information saved from before, but want to remove now, you can. Most browsers allow you to clear form and search history from the browser.
If you want to disable autofill and autocomplete, you can do that as well from browser settings. This will also tell the browsers to never remember the values entered into the form fields.
You can find the related information for different browsers here:
Safari: You need to uncheck the different options mentioned here to disable
Now, if you are a privacy-focused developer like me, you might be wondering, "Can't I as a developer help protect privacy?". Yes, we can! That's exactly what autocomplete="off" is still there for. We can still add that attribute to an entire <form> which will disable both remembering and autocompletion of all form data in that form. We can also add autocomplete="off" individually to specific <input>, <textarea>, <select> to disabled the remembering and autocompletion of specific fields instead of the entire form.
PS: Even with autocomplete="off", most browsers still offer to remember username and password. This is actually done for the same reason digital security trainers ask to use password managers: so that users don't use the same simple passwords everywhere because they have to remember. As a digital security trainer, I would still recommend not using your browser's save password feature, and instead using a password manager. The password managers actually follow the same rule of remembering and auto-filling even with autocomplete="off" for username and password fields.
Accessibility
So, as a privacy-focused developer, you might be wondering, "Well, I should just use autocomplete="off" in every <form> I write from today". Well, that raises some huge accessibility concerns. If you love standards, then look specifically at Understanding Success Criterion 1.3.5: Identify Input Purpose.
There are folks with different disabilities who really benefit from the autocomplete tag, which makes it super important for accessibility:
People with disabilities related to memory, language or decision-making benefit immensely from the auto-filling of data and not need to remember every time to fill up a form
People with disabilities who prefer images/icons for communication can use assistive technology to add icons associated with various different input fields. A lot of them can benefit from proper autocomplete values, if the name attribute is not eligible.
People with motor disabilities benefit from not needing to manually input forms every time.
So, given that almost all browsers have settings to disable these features, it might be okay to not always use autocomplete="off". But, if there are fields that are essentially super sensitive, that you would never want the browser to save information about (e.g., government id, one time pin, credit card security code, etc.), you should use autocomplete="off" on the individual fields instead of the entire <form>. Even if you really really think that the entire form is super sensitive and you need to apply autocomplete="off" on the entire <form> element to protect your user's privacy, you should still at least use autofill detail tokens for the individual fields. This will ensure that the browser doesn't remember the data entered or suggest autofills, but will still help assistive technologies to programmatically determine the purpose of the fields.
A network is a group of computers and computing devices connected together through communication channels, such as cables or wireless media. The computers connected over a network may be located in the same geographical area or spread across the world. The Internet is the largest network in the world and can be called "the network of networks".
ip address
Devices attached to a network must have at least one unique network address identifier known as the IP (Internet Protocol) address. The address is essential for routing packets of information through the network.
Exchanging information across the network requires using streams of small packets, each of which contains a piece of the information going from one machine to another. These packets contain data buffers, together with headers which contain information about where the packet is going to and coming from, and where it fits in the sequence of packets that constitute the stream. Networking protocols and software are rather complicated due to the diversity of machines and operating systems they must deal with, as well as the fact that even very old standards must be supported.
IPV4 and IPV6
There are two different types of IP addresses available: IPv4 (version 4) and IPv6 (version 6). IPv4 is older and by far the more widely used, while IPv6 is newer and is designed to get past limitations inherent in the older standard and furnish many more possible addresses.
IPv4 uses 32-bits for addresses; there are only 4.3 billion unique addresses available. Furthermore, many addresses are allotted and reserved, but not actually used. IPv4 is considered inadequate for meeting future needs because the number of devices available on the global network has increased enormously in recent years.
IPv6 uses 128-bits for addresses; this allows for 3.4 X 1038 unique addresses. If you have a larger network of computers and want to add more, you may want to move to IPv6, because it provides more unique addresses. However, it can be complex to migrate to IPv6; the two protocols do not always inter-operate well. Thus, moving equipment and addresses to IPv6 requires significant effort and has not been quite as fast as was originally intended. We will discuss IPv4 more than IPv6 as you are more likely to deal with it.
One reason IPv4 has not disappeared is there are ways to effectively make many more addresses available by methods such as NAT (Network Address Translation). NAT enables sharing one IP address among many locally connected computers, each of which has a unique address only seen on the local network. While this is used in organizational settings, it also used in simple home networks. For example, if you have a router hooked up to your Internet Provider (such as a cable system) it gives you one externally visible address, but issues each device in your home an individual local address.
decoding IPv4
A 32-bit IPv4 address is divided into four 8-bit sections called octets.
Example:
IP address → 172 . 16 . 31 . 46
Bit format → 10101100.00010000.00011111.00101110
NOTE: Octet is just another word for byte.
Network addresses are divided into five classes: A, B, C, D and E. Classes A, B and C are classified into two parts: Network addresses (Net ID) and Host address (Host ID). The Net ID is used to identify the network, while the Host ID is used to identify a host in the network. Class D is used for special multicast applications (information is broadcast to multiple computers simultaneously) and Class E is reserved for future use.
Class A network address – Class A addresses use the first octet of an IP address as their Net ID and use the other three octets as the Host ID. The first bit of the first octet is always set to zero. So you can use only 7-bits for unique network numbers. As a result, there are a maximum of 126 Class A networks available (the addresses 0000000 and 1111111 are reserved). Not surprisingly, this was only feasible when there were very few unique networks with large numbers of hosts. As the use of the Internet expanded, Classes B and C were added in order to accommodate the growing demand for independent networks.
Each Class A network can have up to 16.7 million unique hosts on its network. The range of host address is from 1.0.0.0 to 127.255.255.255.
Class b network address – Class B addresses use the first two octets of the IP address as their Net ID and the last two octets as the Host ID. The first two bits of the first octet are always set to binary 10, so there are a maximum of 16,384 (14-bits) Class B networks. The first octet of a Class B address has values from 128 to 191. The introduction of Class B networks expanded the number of networks but it soon became clear that a further level would be needed.
Each Class B network can support a maximum of 65,536 unique hosts on its network. The range of host addresses is from 128.0.0.0 to 191.255.255.255.
Class C network address – Class C addresses use the first three octets of the IP address as their Net ID and the last octet as their Host ID. The first three bits of the first octet are set to binary 110, so almost 2.1 million (21-bits) Class C networks are available. The first octet of a Class C address has values from 192 to 223. These are most common for smaller networks which don’t have many unique hosts.
Each Class C network can support up to 256 (8-bits) unique hosts. The range of host addresses is from 192.0.0.0 to 223.255.255.255.
what is a name resolution?
Name Resolution is used to convert numerical IP address values into a human-readable format known as the hostname. For example, 104.95.85.15 is the numerical IP address that refers to the hostname whitehouse.gov. Hostnames are much easier to remember!
Given an IP address, one can obtain its corresponding hostname. Accessing the machine over the network becomes easier when one can type the hostname instead of the IP address.
Then comes the network configuration files and these are essential to ensure that interfaces function correctly. They are located in the /etc directory tree. However, the exact files used have historically been dependent on the particular Linux distribution and version being used.
For Debian family configurations, the basic network configuration files could be found under /etc/network/, while for Red Hat and SUSE family systems one needed to inspect /etc/sysconfig/network.
Network interfaces are a connection channel between a device and a network. Physically, network interfaces can proceed through a network interface card (NIC), or can be more abstractly implemented as software. You can have multiple network interfaces operating at once. Specific interfaces can be brought up (activated) or brought down (de-activated) at any time.
A network requires the connection of many nodes. Data moves from source to destination by passing through a series of routers and potentially across multiple networks. Servers maintain routing tables containing the addresses of each node in the network. The IP routing protocols enable routers to build up a forwarding table that correlates final destinations with the next hop addresses.
Let’s learn about more networking tools like wget and curl.
Sometimes, you need to download files and information, but a browser is not the best choice, either because you want to download multiple files and/or directories, or you want to perform the action from a command line or a script. wget is a command line utility that can capably handle these kinds of downloads. Whereas curl is used to obtain information about any specific URL.
File Transfer Protocol(FTP)
File Transfer Protocol (FTP) is a well-known and popular method for transferring files between computers using the Internet. This method is built on a client-server model. FTP can be used within a browser or with stand-alone client programs. FTP is one of the oldest methods of network data transfer, dating back to the early 1970s.
Secure Shell(SSH)
Secure Shell (SSH) is a cryptographic network protocol used for secure data communication. It is also used for remote services and other secure services between two devices on the network and is very useful for administering systems which are not easily available to physically work on, but to which you have remote access.
There are few command line tools which we can use to parse text files. This helps us in day to day basis if you are using linux and it becomes essential for a Linux user to adept at performing certain operations on file.
cat
cat is short for concatenate and is one of the most frequently used Linux command line utilities. It is often used to read and print files, as well as for simply viewing file contents. To view a file, use the following command:
$ cat
For example, cat readme.txt will display the contents of readme.txt on the terminal. However, the main purpose of cat is often to combine (concatenate) multiple files together. The tac command (cat spelled backwards) prints the lines of a file in reverse order. Each line remains the same, but the order of lines is inverted.
cat can be used to read from standard input (such as the terminal window) if no files are specified. You can use the > operator to create and add lines into a new file, and the >> operator to append lines (or files) to an existing file. We mentioned this when talking about how to create files without an editor.
To create a new file, at the command prompt type cat > and press the Enter key. This command creates a new file and waits for the user to edit/enter the text, after editing type in CTRL-D at the beginning of the next line to save and exit the editing.
echo
echo simply displays (echoes) text. It is used simply, as in:
$ echo string
echo can be used to display a string on standard output (i.e. the terminal) or to place in a new file (using the > operator) or append to an already existing file (using the >> operator).
The –e option, along with the following switches, is used to enable special character sequences, such as the newline character or horizontal tab:
\n represents newline
\t represents horizontal tab.
echo is particularly useful for viewing the values of environment variables (built-in shell variables). For example, echo $USERNAME will print the name of the user who has logged into the current terminal.
how to work with large files?
System administrators need to work with configuration files, text files, documentation files, and log files. Some of these files may be large or become quite large as they accumulate data with time. These files will require both viewing and administrative updating.
For example, a banking system might maintain one simple large log file to record details of all of one day’s ATM transactions. Due to a security attack or a malfunction, the administrator might be forced to check for some data by navigating within the file. In such cases, directly opening the file in an editor will cause issues, due to high memory utilization, as an editor will usually try to read the whole file into memory first. However, one can use less to view the contents of such a large file, scrolling up and down page by page, without the system having to place the entire file in memory before starting. This is much faster than using a text editor.
head reads the first few lines of each named file (10 by default) and displays it on standard output. You can give a different number of lines in an option.
For example, if you want to print the first 5 lines from /etc/default/grub, use the following command:
$ head –n 5 /etc/default/grub
tail prints the last few lines of each named file and displays it on standard output. By default, it displays the last 10 lines. You can give a different number of lines as an option. tail is especially useful when you are troubleshooting any issue using log files, as one can probably want to see the most recent lines of output.
For example, to display the last 15 lines of somefile.log, use the following command:
$ tail -n 15 somefile.log
to view compressed files
When working with compressed files, many standard commands cannot be used directly. For many commonly-used file and text manipulation programs, there is also a version especially designed to work directly with compressed files. These associated utilities have the letter "z" prefixed to their name. For example, we have utility programs such as zcat, zless, zdiff and zgrep.
managing your files
Linux provides numerous file manipulation utilities that you can use while working with text files.
sort – is used to rearrange the lines of a text file, in either ascending or descending order according to a sort key. The default sort key is the order of the ASCII characters (i.e. essentially alphabetically).
uniq – removes duplicate consecutive lines in a text file and is useful for simplifying the text display.
paste – can be used to create a single file containing all three columns. The different columns are identified based on delimiters (spacing used to separate two fields). For example, delimiters can be a blank space, a tab, or an Enter.
split – is used to break up (or split) a file into equal-sized segments for easier viewing and manipulation, and is generally used only on relatively large files. By default, split breaks up a file into 1000-line segments. The original file remains unchanged, and a set of new files with the same name plus an added prefix is created. By default, the x prefix is added. To split a file into segments, use the command split infile.
Regular expressions are text strings used for matching a specific pattern, or to search for a specific location, such as the start or end of a line or a word. Regular expressions can contain both normal characters or so-called meta-characters, such as * and $.
grep is extensively used as a primary text searching tool. It scans files for specified patterns and can be used with regular expressions, as well as simple strings
Previously, I wrote about the first revision of our RepRap machine based on Prusa i3 printer. This is a project which I have been working with my younger brother. I will be talking about the enhancements, issues, and learnings from the second build of the printer.
3D printed printer parts As soon as we got the first build of the printer working, we started printing printer parts. Basically, the idea is to replace the wooden parts with 3D printed parts which have way better precision.
It has been ridiculously hard and I was about to pull my hair out at several
points on this ordeal, but I finally managed to remove this stupid (revised to
modest language here) DRM from my epub file. I took many side roads that lead me
nowhere and spent quite some hours to find a working solution, so I'm writing
this down in case I (or someone else) may need it again.
Why removing DRM from an ebook
First of all: I'm not saying you should do anything illegal (I'm also not saying
that you shouldn't); however, a friend bought me an ebook as a gift. I told him
about this book and that it was on top of my to-read pile, I just didn't buy it
yet. So he purchased and gifted to me a digital copy and asked if I would lend
it to him when done reading. Sure thing. Except that I couldn't. Bummer!
What I downloaded from the store was an acsm (for Adobe Content Server Message)
file, which is needed to communicate with the Adobe Servers. It verifies that
I'm authorized to download the (DRM protected) book, and to view the content on
different devices that also use my Adobe ID.
A quick internet search...
revealed the following path that lay before me:
install and authorize Adobe Digital Editions
load the acsm in ADE to download the book
export the book as DRM protected epub
use calibre and the DeDRM plugin to remove the DRM protection
So far that sounded perfectly doable. Then I learned that Adobe Digital Editions
is available for Windows, MacOS, Android ... of course there's no Linux app. I
don't have a Windows or Apple machine, and I hate fiddling around with the
phone, so wine it is.
However, on my main machine I also don't have wine - or rather I don't want to,
because I hate to activate the multilib repositories. Wine is still all lib32.
So I used another Ubuntu laptop that I keep for the dirty work of that kind.
Unfortunately (for my mental health) my Ubuntu distribution offers a calibre
version 4.9x, while the latest DeDRM plugin requires calibre 5.x. Fine!, I
think, I just use the latest DeDRM release that plays along with my calibre 4.9.
Well, of course not! That would require calibre running on Python 2; mine runs on
Python 3. There, I loose my hair.
So I end up installing the latest version from
calibre-ebook.com. Should you have to do this, make sure to pick
the "isolated" install, that does not require root.
This worked for me
We start with a fresh wine prefix to install our ADE:
Evoking winecfg will initialize the new prefix. Make sure to pick Windows 10
here - I'll tell you why in a bit.
When I started ADE and tried to open the acsm directly, it crashed. So I had to
first manually authorize the laptop via the Help menu. An Adobe ID is required,
create one if you don't have it already. Then we can load the acsm and get the
DRM protected ebook. I could find the epub files in a new ADE sub-folder in my
~/Documents, but you can also save them to a different location in ADE.
To use calibre's DeDRM plugin to remove the protection,
we need to extract the Adobe ID key from ADE, so make sure to have both
installed. The DeDRM plugin is nice enough to offer a field for our wine prefix.
However, this also means that a wine Python is going to be necessary. This is
what you need to pick Windows 10 for in winecfg: Python 3.9 or 3.10 can easily
be downloaded from https://www.python.org/, but installers only
work on Windows 8.1 or higher (and nobody wants to use 8.1). Remember to pick
the 32 bit version; it's still meant for the wine environment.
So we install Python 3.10 and the necessary dependencies to make the scripts work:
Set the check on "Add Python to Path" during the installation of Python, to
spare you some headaches
The scripts also require an OpenSSL distribution in your wine environment (I
failed to install PyCrypto, the other viable option. Don't ask me why, because I
don't know.). I did find a working package here. Make sure to
pick version 1.1; this is what the DeDRM scripts use (and again choose the 32
bit variant, of course).
Having this set up, you should be able to add your Adobe ID key to the DeDRM
plugin in the plugin's preferences dialog.
I actually didn't know that I would need all this, but after enough cursing I
finally remembered to start calibre with calibre-debug -g, to actually learn
why that stupid (kidding, it's great) script failed. Actually in the end I just
located the python script to extract the ADE keys and ran it manually:
cd ~/.config/calibre/plugins/DeDRM/libraryfiles/
wine python adobekey.py
And this is where you have deserved a bottle of well chilled beer. Whether you
managed to import the key to the DeDRM plugin directly, or manually extracted
them with running adobekey.py for later file import; Now your plugin should be
armed and ready to have your book(s) added to calibre, and DRM protection should
be removed on import.
I also tried...
a tool called knock that promised to remove DRM protection, with
neither wine nor Adobe Digital Editions (ADE) required. I saw very positive
comments so I assume it can work somehow, but apparently I was too dumb to use
it. It probably would have been easier to install via nix, but that's
completely unknown territory for me, so I tried to install all dependencies
manually in a virtual environment, and as with many click apps, it was a pain
to use it in any way that differs the original intention.
The binary packaged release also failed for me, but maybe you are smarter than I
was.
EmacsConf 2021 happened in November last year. Same as last two years, it was an online conference. Thanks to all the volunteers and organizers, it was a great experience.
EmacsConf is the conference about the joy of Emacs, Emacs Lisp, and memorizing key sequences.
— EmacsConf website.
It was a 2 days conference with 45 talks in total. Despite being a thanksgiving weekend, the peak count of attendees was around 300.
I want to spend more time in 2022 to “Do Less!”, take things as they come, take breaks and try to travel or work on mechanical keyboards during those breaks. The time off that I took in 2021 - first half of August and last half of December, gave me a lot of breathing space and time to rethink priorities especially the first one where I partially overcame burnout due to several factors (Thanks to VMware for the generous leaves!).
Last year, I started actively taking care of my health. I jogged for ~700kms in the last quarter (Oct-Dec) although I did not jog during the time I was travelling to Bangalore/Delhi/Kolkata. I want to continue the same trend and target atleast 3000kms of jogging and a 10km run in 2022.
2021 was also when I moved back to my hometown, Agartala, and that too after a span of 9 years. I spent a lot of time with close family members and friends from school. I plan to spend more time with people who I care about and who care about me, be it in Bangalore or Agartala.
Perfectly timed! I had a Raspberry Pi 4 lying around and had just ordered a few more to set up a home lab during the holidays. The newer Pis are yet to arrive, so better utilize the time writing a walkthrough on how to use Flatcar Container Linux on your Pis.
Hardware Requirements
Goes without saying, a Raspberry Pi 4
Form of storage, either USB and/or SD card. USB 3.0 drive recommended because of the much better performance for the price.
⚠️ WARNING ⚠️
The UEFI firmware used in this guide is an UNOFFICIAL firmware. There is a possibility of damage caused due to the usage of this firmware.
The author of this article would not be liable for any damage caused. Please follow this article at your own risk.
Update the EEPROM
The Raspberry PI 4 use an EEPROM to boot the system. Before proceeding ahead, it is recommended to update the EEPROM. Raspberry Pi OS automatically updates the bootloader on system boot. In case you are using Raspberry Pi OS already, then the bootloader may be already updated.
For manually updating the EEPROM, you can either use the Raspberry Pi Imager or the raspi-config. The former is the recommended method in the Raspberry Pi documentation.
We will also see later how the RPi4 UEFI firmware needs a recent version of EEPROM.
Using the Raspberry Pi Imager (Recommended)
Install the Raspberry Pi Imager software. You can also look for the software in your distribution repository.
Being a Fedora user I installed the software using dnf
dnf install rpi-imager
Launch Raspberry Pi Imager.
Select Misc utility images under Operating System.
Select Bootloader.
Select the boot-mode, SD, USB
Select the appropriate storage, SD or USB
Boot the Raspberry Pi with the new image and wait for at least 10 seconds.
The green activity LED will blink with a steady pattern and the HDMI display will be green on success.
Power off the Raspberry Pi and disconnect the storage.
# The update is pulled from the `default` release channel.# The other available channels are: latest and beta# You can update the channel by updating the value of# `FIRMWARE_RELEASE_STATUS` in the `/etc/default/rpi-eeprom-update`# file. This is useful usually in case when you want# features yet to be made available on the default channel.# Install the update
sudo rpi-eeprom-update -a
# A reboot is needed to apply the update# To cancel the update, you can use: sudo rpi-eeprom-update -r
sudo reboot
Installing Flatcar
Install flatcar-install script
Flatcar provides a simple installer script that helps install Flatcar Container Linux on the target disk. The script is available on Github, and the first step would be to install the script in the host system.
mkdir -p ~/.local/bin
# You may also add `PATH` export to your shell profile, i.e bashrc, zshrc etc.
export PATH=$PATH:$HOME/.local/bin
curl -LO https://raw.githubusercontent.com/flatcar-linux/init/flatcar-master/bin/flatcar-install
chmod +x flatcar-install
mv flatcar-install ~/.local/bin
Install Flatcar on the target device
Now that we have the flatcar-install installed in our host machine. We would go ahead and install the Flatcar Container Linux image on the target device.
The target device could be a USB or SD Card. In my case, I reused the existing SD Card which I used in the previous steps. You can use a separate storage device as well.
The options that we will be using with the scripts are:
# -d DEVICE Install Flatcar Container Linux to the given device.# -C CHANNEL Release channel to use# -B BOARD Flatcar Container Linux Board to use# -o OEM OEM type to install (e.g. ami), using flatcar_production_<OEM>_image.bin.bz2# -i IGNITION Insert an Ignition config to be executed on boot.
The device would be the target device that you would like to use. You can use the lsblk command to find the appropriate disk. Here, I’m using /dev/sda which was in my case.
With the given values of channel and board, the script would download the image, verify it with gpg, and then copy it bit for bit to disk.
In our case, Flatcar does not yet ship Raspberry PI specific OEM images yet so the value will be an empty string ''.
Pass the Ignition file, config.json in my case, to provision the Pi during boot.
rpi-uefi community ships a SBBR-compliant(UEFI+ACPI), ArmServerReady ARM64 firmware for Raspberry Pi 4. We would be using the same to UEFI boot Flatcar.
v1.17 of the pftf/RPi4 introduced two major changes:
Firstly, it enabled firmware boot directly from the USB. This is particularly helpful if you are using the installation process using a USB device. To add a fun story, I dropped my Pi and broke the SD card slot. Until the Pi gets repaired, I’m making use of direct USB boot 😎
Secondly, support for directly placing the Pi boot files into the EFI System Partition (ESP). This feature was not implemented in the firmware, rather from the upstream firmware from Raspberry Pi Foundation. This is why it is recommended to update the Pi EEPROM at the very beginning.
Let’s move ahead with the final steps.
Place the UEFI firmware into the EFI System Partition.
Today, we are going to see how we can use | operator in our python code to achieve clean code.
Here is the code where we have used map and filter for a specific operation.
In [1]: arr = [11, 12, 14, 15, 18]
In [2]: list(map(lambda x: x * 2, filter(lambda x: x%2 ==1, arr)))
Out[2]: [22, 30]
The same code with Pipes.
In [1]: from pipe import select, where
In [2]: arr = [11, 12, 14, 15, 18]
In [3]: list(arr | where (lambda x: x%2 ==1) | select(lambda x:x *2))
Out[3]: [22, 30]
Pipes passes the result of one function to another function, have inbuilt pipes method like select, where, tee, traverse.
Install Pipe
>> pip install pipe
traverse
Recursively unfold iterable:
In [12]: arr = [[1,2,3], [3,4,[56]]]
In [13]: list(arr | traverse)
Out[13]: [1, 2, 3, 3, 4, 56]
select()
An alias for map().
In [1]: arr = [11, 12, 14, 15, 18]
In [2]: list(filter(lambda x: x%2 ==1, arr))
Out[2]: [11, 15]
where()
Only yields the matching items of the given iterable:
In [1]: arr = [11, 12, 14, 15, 18]
In [2]: list(arr | where(lambda x: x % 2 == 0))
Out[2]: [12, 14, 18]
sort()
Like Python's built-in “sorted” primitive. Allows cmp (Python 2.x
only), key, and reverse arguments. By default, sorts using the
identity function as the key.
That's all for today, In this blog you have seen how to install the Pipe and use the Pipe to write clean and short code using inbuilt pipes, you can check more over here
I have had the pleasure to talk with 30+ folks and help them in their journey in the field of computer science and/or growing in their career with Open Source Software. It has been an honour that so many wanted to talk to me and get my views.
For the month of December, I am going to take a break from the mentoring sessions as I will travelling on most weekends and will be out for vacation in the later half of the month.
Fret not, I will try to make up for the lost time by doubling up my commitment for January 2022. But, in case you need to urgently talk with me, drop me a ping on hey [at] nabarun [dot] dev and I will try to schedule something which works for both of us.
Wish you all a very happy December! 🎉
PS: Stay tuned to the RSS feed! There are many articles which are languishing in my drafts, I may publish a few of them.
Martin M. Broadwell defines four stages of competence in Teaching for Learning:
unconcious imcompetence
concious incompetence
concious competence
unconcious competence.
Specifically, unconcious incompetence means you are unable to perform a task correctly and are unaware of the gap.
Concious incompetence means you are unable to perform a task correctly but are aware of the gap.
Concious competence means you are capable of performing a task with effort.
Finally, unconcious competence means you are capable of performing a task effortlessly.
All engineers start out conciously or unconciously incompetent. Even if you know everything about software engineering (an impossible task), you’re going to have to learn practical skills like those covered in this book. Your goal is to get to concious competence as quickly as possible.
Cunningham’s Law And Bike-Shedding
We advise you to document, conventions, onboarding procedures, and other oral traditions on your team. You will get a lot of comments and corrections. Do not take the comments Personally.
The point is not to write a perfect document but rather to write enough to trigger a discussion that flashes out the details. This is a variation of Cunningham’s law, which states that “the best way to get the right answer on the internt is not to ask a question; it’s to post the wrong answer.”
Be prepared for trivial discussions to become drawn out, a phenomenon called bike-shedding. Bike-shedding is an allegory by C. Northcote Parkinson, describing a committee assigned to review designs for a power plant. The committee approves the plans within minutes, as they are too complex to actually discuss. They then spend 45 minutes discussing the materials for the bike shed next to the plant. Bike-shedding comes up a lot in technical work.
Mistakes are unavoidable (Learn by Doing!)
At one of Chris’s first internships, he was working on a project with a senior engineer. Chris finished some changes and needed to get them deployed. The senior engineer showed him how to check code into the revision control system, CVS. Chris followed the instructions, blindly running through steps that involved branching, tagging, and merging. Afterward, he continued with the rest of his day and went home.
The next morning, Chris strolled in cheerfully and greeted everyone. They did their best to respond in kind, but they were low. When Chris asked what was up, they informed him that he had managed to corrupt the entire CVS repository. All of the company’s code had been lost. They had been up the entire night desperately trying to recover what they could and were eventually able to get most of the code back (except for Chris’s commits and a few others).
Chris was pretty shaken by the whole thing. His manager pulled him aside and told him not to worry: Chris had done the right thing working with the senior engineer.
Mistakes happen. Every engineer has some version of a story like this. Do your best, and try to understand what you’re doing, but know that these things happen.
One of these posts; not technical, written with a hot head, controversial topics. Don't read if that already annoys you.
The argument
I've had a hot discussion yesterday, and I'm not quite happy how it went. It's not important how we came there but the core argument was about why I think that less to no money should be spend on military and arms, while the same money should be spent on peace studies and education of diplomats - which I refer to as people who try to understand a different culture and initiate an exchange of values and ethics - not eloquent deceitful but open and direct, fair and square.
My opposition's opinion was different, claiming that we would never live in peace without an army enforcing the peace. That's an opinion that I could argue about all day long, but then he said what really is upsetting me; It's human nature to have wars and to only care for oneself.
I strongly disagree.
As for myself, I don't want that. Does this make me not a human? When we hear, read or see in the news what war crimes are committed, how people are forced to live (if they may live), what people are capable of doing to each other, we call this inhuman. What do we actually mean when we say so? Are the people committing these crimes not human people, or is it the act that is inhuman?
Again, my opponent argues that you should not call someone a liar, but rather say you lied - so to condemn the act of lying but not the person. Well, do we do that in other situations? May we call someone a murderer or a thief when they kill people or steal things? Is it a matter of frequency how often I eat meat while still calling myself a vegetarian?
Being human
The definition of what is or is not human may differ vastly. However, it is schizophenic that if we agree something is inhuman, to not take the consequences; As a human being stop acting inhuman!
Like being a vegetarian or not being a liar and murderer, for me being human is a continuous process that demands continuous work on ourselves. I have to work on myself to act how I want a human to act. When I think that the way meat is "produced" today on earth is inhuman, then I will have to change my diet. I have a vote, I can choose what to buy (or not to buy) or what to write on my blog. I can show people that I disagree and I can sit together with them, discuss differences of opinions and find a rational consent, because that's what I think how a human would act.
On a fun side note, my opponent also argued that animals also fight themselves, so this is just natural. Well, be an animal then.
This is the first post on my blog in a while. I guess this is coming after almost 2 years 9 months. Yes, never wrote those end-of-year review posts too.
I've been playing around with containers for a few years now. I find them very useful.
If you host your own, like I do, you probably write a lot of nginx configurations, maybe apache.
If that's the case, then you have your own solution to get certificates.
I'm also assuming that you are using let's encrypt with certbot or something.
Well, I didn't want to anymore. It was time to consolidate. Here comes Traefik.
an open-source Edge Router that makes publishing your services a fun and easy experience. It receives requests on behalf of your system and finds out which components are responsible for handling them.
Which made me realize, I still need nginx somewhere. We'll see when we get to it. Let's focus on Traefik.
Configuration
If you run a lot of containers and manage them, then you probably use docker-compose.
I'm still using version 2.3, I know I am due to an upgrade but I'm working on it slowly.
It's a bigger project… One step at a time.
Let's start from the top, literally.
---
version: '2.3'
services:
Note
Upgrading to version 3.x of docker-compose requires the creation of network to link containers together. It's worth investing into, this is not a docker-compose tutorial.
Let's Encrypt have set limits on how many certificates you can request per certain amount of time. To test your certificate request and renewal processes, use their staging infrastructure. It is made for such purpose.
Then we mount it, for persistence.
- "./traefik/acme.json:/acme.json"
Let's not forget to add our CloudflareAPI credentials as environment variables for Traefik to use.
With a little bit of Traefik documentation searching and a lot of help from htpasswd, we can create a basicauth login to protect the dashboard from public use.
[engine x] is an HTTP and reverse proxy server, a mail proxy server, and a generic TCP/UDP proxy server, originally written by Igor Sysoev.
In this example, we're going to assume you have a static blog generated by a static blog generator of your choice and you would like to serve it for people to read it.
So let's do this quickly as there isn't much to tell except when it comes to labels.
We are mounting the blog directory from our host to /usr/share/nginx/html/blog as read-only into the nginx container. We are also providing nginx with a template configuration and passing the variables as environment variables as you noticed. It is also mounted as read-only. The configuration template looks like the following, if you're wondering.
server {
listen ${NGINX_BLOG_PORT};
server_name localhost;
root /usr/share/nginx/html/${NGINX_BLOG_HOST};
location / {
index index.html;
try_files $uri $uri/ =404;
}
}
Traefik configuration
So, Traefik configuration at this point is a little bit tricky for the first time.
As my followers well know, by now, I am a tinkerer at heart. Why do I do things ? No one knows ! I don't even know.
All I know, all I can tell you is that I like to see what can I do with the tools I have at hand. How can I bend them to my will.
Why, you may ask. The answer is a bit complicated; part of who I am, part of what I do as a DevOps. End line is, this time I was curious.
I went down a road that taught me so much more about containers, docker, docker-compose and even Linux itself.
The question I had was simple, can I run a container only through Tor running in another container?
Tor
I usually like to start topics that I haven't mentioned before with definitions. In this case, what is Tor, you may ask ?
What is Tor?
Tor is free software and an open network that helps you defend against traffic analysis, a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships, and state security.
Although that home page is obscure because it was replaced by the new design of the website.
Although I love what Tor has done with all the services they offer, don't get me wrong.
But giving so much importance on the browser only and leaving the rest for dead when it comes to website, I have to say, I'm a bit sad.
Anyway, let's share the love for Tor and thank them for the beautiful project they offered humanity.
Now that we thanked them, let's abuse it.
Tor in a container
The task I set to discover relied on Tor being containerized.
The first thing I do is, simply, not re-invent the wheel.
Let's find out if someone already took that task.
With a litte bit of search, I found the dperson/torproxy docker image.
It isn't ideal but I believe it is written to be rebuilt.
Can we run it ?
docker run -it -p 127.0.0.1:8118:8118 -d dperson/torproxy
curl -Lx http://localhost:8118 http://jsonip.com/
And this is definitely not your IP. Don't take my word for it!
Go to http://jsonip.com/ in a browser and see for yourself.
Now that we know we can run Tor in a container effectively, let's kick it up a notch.
docker-compose
I will be testing and making changes as I go along. For this reason, it's a good idea to use docker-compose to do this.
Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration.
Now that we saw what the docker team has to say about docker-compose, let's go ahead and use it.
First, let's implement what we just ran ad-hoc in docker-compose.
Let's put it all together in a docker-compose.yaml file and run it.
docker-compose up -d
Keep that terminal open, and let's put the hypothesis to the test and see if rises up to be a theory.
docker exec air-gapped apt-get update
Aaaaand…
Err:1 http://archive.ubuntu.com/ubuntu focal InRelease
Temporary failure resolving 'archive.ubuntu.com'
Err:2 http://security.ubuntu.com/ubuntu focal-security InRelease
Temporary failure resolving 'security.ubuntu.com'
Err:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Temporary failure resolving 'archive.ubuntu.com'
Err:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease
Temporary failure resolving 'archive.ubuntu.com'
Reading package lists...
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-updates/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-backports/InRelease Temporary failure resolving 'archive.ubuntu.com'
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/focal-security/InRelease Temporary failure resolving 'security.ubuntu.com'
W: Some index files failed to download. They have been ignored, or old ones used instead.
looks like it's real peeps, hooray !
Putting everything together
Okay, now let's put everything together. The list of changes we need to make are minimal.
First, I will list them, then I will simply write them out in docker-compose.
Create an internet network for the Tor container
Attach the internet network to the Tor container
Attach the no-internet network to the Tor container so that our air-gapped container can access it.
Chatted with a few people about the state of Python and pip on Debian.
General notes
Looking back, I think I picked up a couple of new projects based on
random brain waves I had! That’s perfectly timed, because I’ve decided
to pivot away from my earlier approach of “yay, more responsibility!”.
What next?
This is getting increasingly harder to decide on, as my free time
chunks are becoming smaller and I’m picking up bigger projects. :)
Technical
Sphinx Theme PEP 517 stuff: Make the initial release.
sphinx-basic-ng: Make the first usable release.
pip: Clear some of the backlog on the pull request front.
pip: More progress on the documentation rewrite.
Communication
Spend more time looking into the Python lockfile standardisation effort.
Write a blog post, on automated code formatting.
Find more speaking opportunities, to talk about things that aren’t Python packaging!
I presented 2 talks at FOSDEM: in the Python devroom and
Open Source Design devroom. Shout-out to Bernard Tyers, for all the help and
the bazillion reminders to make sure I do all the things on time. :)
Collaborating on designing a lockfile format for Python, that can hopefully
be standardised for interoperability.
General notes
Onboarding in a new company, relocating internationally, settling into a new
space has been… well, it’s all been a very interesting learning experience.
Given the fairly strict lockdown and the percentage of people wearing masks
in my locality, I’ve spent a lots of time indoors. Looking forward to the
social weekends experiment I’m doing.
What next?
Technical
pip: Work on the documentation rewrite, hopefully to get it ready in time for
the next release.
pip: Clear some of the backlog on the pull request front.
pip: General discussions for new features and enhancements.
TOML: Work on writing that the compliance test suite.
TOML: Bring toml for Python back from the dead.
Furo: Make the first stable release.
Start work on the other Sphinx theme I have in mind.
Communication
Spend more time looking into the Python lockfile standardisation effort.
Catch up on the Python-on-Debian saga, and see how I can contribute
constructively.
I recently read this paper titled, Understanding Real-World Concurrency Bugs in Go (PDF), that studies concurrency bugs in Golang and comments on the new primitives for messages passing that the language is often known for.
I am not a very good Go programmer, so this was an informative lesson in various ways to achieve concurrency and synchronization between different threads of execution. It is also a good read for experienced Go developers as it points out some important gotchas to look out for when writing Go code. The fact that it uses real world examples from well known projects like Docker, Kubernetes, gRPC-Go, CockroachDB, BoltDB etc. makes it even more fun to read!
The authors analyzed a total of 171 concurrency bugs from several prominent Go open source projects and categorized them in two orthogonal dimensions, one each for the cause of the bug and the behavior. The cause is split between two major schools of concurrency
Along the cause dimension, we categorize bugs into those that are caused by misuse of shared memory and those caused by misuse of message passing
and the behavior dimension is similarly split into
we separate bugs into those that involve (any number of ) goroutines that cannot proceed (we call themblocking bugs) and those that do not involve any blocking (non-blocking bugs)
Interestingly, they chose the behavior to be blocking instead of deadlock since the former implies that atleast one thread of execution is blocked due to some concurrency bug, but the rest of them might continue execution, so it is not a deadlock situation.
Go has primitive shared memory protection mechanisms like Mutex, RWMutex etc. with a caveat
Write lock requests in Go have ahigher privilege than read lock requests.
as compared to pthread in C. Go also has a new primitive called sync.Once that can be used to guarantee that a function is executed only once. This can be useful in situations where some callable is shared across multiple threads of execution but it shouldn't be called more than once. Go also hassync.WaitGroups , which is similar to pthread_join to wait for various threads of executioun to finish executing.
Go also uses channels for the message passing between different threads of executions called Goroutunes. Channels can be buffered on un-buffered (default), the difference between them being that in a buffered channel the sender and receiver don't block on each other (until the buffered channel is full).
The study of the usage patterns of these concurrency primitives in various code bases along with the occurence of bugs in the codebase concluded that even though message passing was used at fewer places, it accounted for a larger number of bugs(58%).
Implication 1:With heavier usages of goroutines and newtypes of concurrency primitives, Go programs may potentiallyintroduce more concurrency bugs
Also, interesting to note is this observation in tha paper
Observation 5:All blocking bugs caused by message passing are related to Go’s new message passing semantics like channel. They can be difficult to detect especially when message passing operations are used together with other synchronization mechanisms
The authors also talk about various ways in which Go runtime can detect some of these concurrency bugs. Go runtime includes a deadlock detector which can detect when there are no goroutunes running in a thread, although, it cannot detect all the blocking bugs that authors found by manual inspection.
For shared memory bugs, Go also includes a data race detector which can be enbaled by adding -race option when building the program. It can find races in memory/data shared between multiple threads of execution and uses happened-before algorithm underneath to track objects and their lifecycle. Although, it can only detect a part of the bugs discovered by the authors, the patterns and classification in the paper can be leveraged to improve the detection and build more sophisticated checkers.
TLDR; Trying to learn new things I tried writing a URL shortner called shorty. This is a first draft and I am trying to approach it from first principle basis. Trying to break down everything to the simplest component.
I decided to write my own URL shortner and the reason for doing that was to dive a little more into golang and to learn more about systems. I have planned to not only document my learning but also find and point our different ways in which this application can be made scalable, resilient and robust.
A high level idea is to write a server which takes the big url and return me a short url for the same. I have one more requirement where I do want to provide a slug i.e a custom short url path for the same. So for some links like https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed, I want to have a url like url.farhaan.me/linktray which is easy to remember and distribute.
The way I am thinking to implement this is by having two components, I want a CLI interface which talks to my Server. I don’t want a fancy UI for now because I want it to be exclusively be used through terminal. A Client-Server architecture, where my CLI client sends a request to the server with a URL and an optional slug. If a slug is present URL will have that slug in it and if it doesn’t it generates a random string and make the URL small. If you see from a higher level it’s not just a URL shortner but also a URL tagger.
The way a simple url shortner works:
Flow Diagram
A client makes a request to make a given URL short, server takes the URL and stores it to the database, server then generates a random string and maps the URL to the string and returns a URL like url.farhaan.me/<randomstring>.
Now when a client requests to url.farhaan.me/<randomstring>, it goest to the same server, it searches the original URL and redirects the request to a different website.
The slug implementation part is very straightforward, where given a word, I might have to search the database and if it is already present we raise an error but if it isn’t we add it in the database and return back the URL.
One optimization, since it’s just me who is going to use this, I can optimize my database to see if the long URL already exists and if it does then no need to create a new entry. But this should only happen in case of random string and not in case of slugs. Also this is a trade off between reducing the redundancy and latency of a request.
But when it comes to generating a random string, things get a tiny bit complicated. This generation of random strings, decides how many URLs you can store. There are various hashing algorithms that I can use to generate a string I can use md5, base10 or base64. I also need to make sure that it gives a unique hash and not repeated ones.
Unique hash can be maintained using a counter, the count either can be supplied from a different service which can help us to scale the system better or it can be internally generated, I have used database record number for the same.
If you look at this on a system design front. We are using the same Server to take the request and generate the URL and to redirect the request. This can be separated into two services where one service is required to generate the URL and the other just to redirect the URL. This way we increase the availability of the system. If one of the service goes down the other will still function.
The next step is to write and integrate a CLI system to talk to the server and fetch the URL. A client that can be used for an end user. I am also planning to integrate a caching mechanism in this but not something out of the shelf rather write a simple caching system with some cache eviction policy and use it.
Till then I will be waiting for the feedback. Happy Hacking.
I now have a Patreon open so that you folks can support me to do this stuff for longer time and sustain myself too. So feel free to subscribe to me and help me keeping doing this with added benefits.
TLDR; Link Tray is a utility we recently wrote to curate links from different places and share it with your friends. The blogpost has technical details and probably some productivity tips.
Link Bubble got my total attention when I got to know about it, I felt it’s a very novel idea, it helps to save time and helps you to curate the websites you visited. So on the whole, and believe me I am downplaying it when I say Link Bubble does two things:
Saves time by pre-opening the pages
Helps you to keep a track of pages you want to visit
It’s a better tab management system, what I felt weird was building a whole browser to do that. Obviously, I am being extremely naive when I am saying it because I don’t know what it takes to build a utility like that.
Now, since they discontinued it for a while and I never got a chance to use it. I thought let me try building something very similar, but my use case was totally different. Generally when I go through blogs or articles, I open the links mentioned in a different tab to come back to them later. This has back bitten me a lot of time because I just get lost in so many links.
I thought if there is a utility which could just capture the links on the fly and then I could quickly go through them looking at the title, it might ease out my job. I bounced off the same idea across to Abhishek and we ended up prototyping LinkTray.
Our first design was highly inspired by facebook messenger but instead of chatheads we have links opened. If you think about it the idea feels very beautiful but the design is “highly” not scalable. For example if you have as many as 10 links opened we had trouble in finding our links of interest which was a beautiful design problems we faced.
We quickly went to the whiteboard and put up a list of requirements, first principles; The ask was simple:
To share multiple links with multiple people with least transitions
To be able to see what you are sharing
To be able to curate links (add/remove/open links)
We took inspiration from an actual Drawer where we flick out a bunch of links and go through them. In a serendipitous moment the design came to us and that’s how link tray looks like the way it looks now.
Link Tray
Link Tray was a technical challenge as well. There is a plethora of things I learnt about the Android ecosystem and application development that I knew existed but never ventured into exploring it.
Link Tray is written in Java, and I was using a very loosely maintained library to get the overlay activity to work. Yes, the floating activity or application that we see is called an overlay activity, this allows the application to be opened over an already running application.
The library that I was using doesn’t have support for Android O and above. To figure that out it took me a few nights , also because I was hacking on the project during nights . After reading a lot of GitHub issues I figured out the problem and put in the support for the required operating system.
One of the really exciting features that I explored about Android is Services. I think I might have read most of the blogs out there and all the documentation available and I know that I still don't know enough. I was able to pick enough pointers to make my utility to work.
Just like Uncle Bob says make it work and then make it better. There was a persistent problem, the service needs to keep running in the background for it to work. This was not a functional issue but it was a performance issue for sure and our user of version 1.0 did have a problem with it. People got mislead because there was constant notification that LinkTray is running and it was annoying. This looked like a simple problem on the face but was a monster in the depth.
Architecture of Link Tray
The solution to the problem was simple stop the service when the tray is closed, and start the service when the link is shared back to link tray. Tried, the service did stop but when a new link was shared the application kept crashing. Later I figured out the bound service that is started by the library I am using is setting a bound flag to True but when they are trying to reset this flag , they were doing at the wrong place, this prompted me to write this StackOverflow answer to help people understand the lifecycle of service. Finally after a lot of logs and debugging session I found the issue and fixed it. It was one of the most exciting moment and it help me learn a lot of key concepts.
The other key learning, I got while developing Link Tray was about multi threading, what we are doing here is when a link is shared to link tray, we need the title of the page if it has and favicon for the website. Initially I was doing this on the main UI thread which is not only an anti-pattern but also a usability hazard. It was a network call which blocks the application till it was completed, I learnt how to make a network call on a different thread, and keep the application smooth.
Initially approach was to get a webview to work and we were literally opening the links in a browser and getting the title and favicon out, this was a very heavy process. Because we were literally spawning a browser to get information about links, in the initial design it made sense because we were giving an option to consume the links. Over time our design improved and we came to a point where we don’t give the option to consume but to curate. Hence we opted for web scraping, I used custom headers so that we don’t get caught by robot.txt. And after so much of effort it got to a place where it is stable and it is performing great.
It did take quite some time to reach a point where it is right now, it is full functional and stable. Do give it a go if you haven’t, you can shoot any queries to me.
So, recently I started using windows for work. Why? There are a couple of reasons, one that I needed to use MSVC, that is the Microsoft Visual C++ toolchain and the other being, I wasn’t quite comfortable to ifdef stuff for making it work on GCC aka, the GNU counterpart of MSVC.
After an anxious month, I am writing a Krita Weekly again and probably this would be my last one too, though I hope not. Let’s start by talking about bugs. Unlike the trend going about the last couple of months, the numbers have taken a serious dip.
[Published in Open Source For You (OSFY) magazine, October 2017 edition.]
This article is the eighth in the DevOps series. In this issue, we shall learn to set up Docker in the host system and use it with Ansible.
Introduction
Docker provides operating system level virtualisation in the form of containers. These containers allow you to run standalone applications in an isolated environment. The three important features of Docker containers are isolation, portability and repeatability. All along we have used Parabola GNU/Linux-libre as the host system, and executed Ansible scripts on target Virtual Machines (VM) such as CentOS and Ubuntu.
Docker containers are extremely lightweight and fast to launch. You can also specify the amount of resources that you need such as CPU, memory and network. The Docker technology was launched in 2013, and released under the Apache 2.0 license. It is implemented using the Go programming language. A number of frameworks have been built on top of Docker for managing these cluster of servers. The Apache Mesos project, Google’s Kubernetes, and the Docker Swarm project are popular examples. These are ideal for running stateless applications and help you to easily scale them horizontally.
Setup
The Ansible version used on the host system (Parabola GNU/Linux-libre x86_64) is 2.3.0.0. Internet access should be available on the host system. The ansible/ folder contains the following file:
ansible/playbooks/configuration/docker.yml
Installation
The following playbook is used to install Docker on the host system:
The Parabola package repository is updated before proceeding to install the dependencies. The python2-docker package is required for use with Ansible. Hence, it is installed along with the docker package. The Docker daemon service is then started and the library/hello-world container is fetched and executed. A sample invocation and execution of the above playbook is shown below:
With verbose ’-v’ option to ansible-playbook, you will see an entry for LogPath, such as /var/lib/docker/containers//-json.log. In this log file you will see the output of the execution of the hello-world container. This output is the same when you run the container manually as shown below:
$ sudo docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://cloud.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/
Example
A Deep Learning (DL) Docker project is available (https://github.com/floydhub/dl-docker) with support for frameworks, libraries and software tools. We can use Ansible to build the entire DL container from the source code of the tools. The base OS of the container is Ubuntu 14.04, and will include the following software packages:
Tensorflow
Caffe
Theano
Keras
Lasagne
Torch
iPython/Jupyter Notebook
Numpy
SciPy
Pandas
Scikit Learn
Matplotlib
OpenCV
The playbook to build the DL Docker image is given below:
We first clone the Deep Learning docker project sources. The docker_image module in Ansible helps us to build, load and pull images. We then use the Dockerfile.cpu file to build a Docker image targeting the CPU. If you have a GPU in your system, you can use the Dockerfile.gpu file. The above playbook can be invoked using the following command:
Depending on the CPU and RAM you have, it will take considerable amount of time to build the image with all the software. So be patient!
Jupyter Notebook
The built dl-docker image contains Jupyter notebook which can be launched when you start the container. An Ansible playbook for the same is provided below:
- name: Start Jupyter notebook
hosts: localhost
gather_facts: true
become: true
tags: [notebook]
vars:
DL_DOCKER_NAME: "floydhub/dl-docker"
tasks:
- name: Run container for Jupyter notebook
docker_container:
name: "dl-docker-notebook"
image: "{{ DL_DOCKER_NAME }}:cpu"
state: started
command: sh run_jupyter.sh
You can invoke the playbook using the following command:
The Dockerfile already exposes the port 8888, and hence you do not need to specify the same in the above docker_container configuration. After you run the playbook, using the ‘docker ps’ command on the host system, you can obtain the container ID as indicated below:
$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a876ad5af751 floydhub/dl-docker:cpu "sh run_jupyter.sh" 11 minutes ago Up 4 minutes 6006/tcp, 8888/tcp dl-docker-notebook
You can now login to the running container using the following command:
$ sudo docker exec -it a876 /bin/bash
You can then run an ‘ifconfig’ command to find the local IP address (“172.17.0.2” in this case), and then open http://172.17.0.2:8888 in a browser on your host system to see the Jupyter Notebook. A screenshot is shown in Figure 1:
TensorBoard
TensorBoard consists of a suite of visualization tools to understand the TensorFlow programs. It is installed and available inside the Docker container. After you login to the Docker container, at the root prompt, you can start Tensorboard by passing it a log directory as shown below:
# tensorboard --logdir=./log
You can then open http://172.17.0.2:6006/ in a browser on your host system to see the Tensorboard dashboard as shown in Figure 2:
Docker Image Facts
The docker_image_facts Ansible module provides useful information about a Docker image. We can use it to obtain the image facts for our dl-docker container as shown below:
The ANSIBLE_STDOUT_CALLBACK environment variable is set to ‘json’ to produce a JSON output for readability. Some important image facts from the invocation of the above playbook are shown below:
[NOTE: This post originally appeared on
deepsource.io, and has been
posted here with due permission.]
In the early part of the last century, when David Hilbert was working on
stricter formalization of geometry than Euclid, Georg Cantor had worked out a
theory of different types of infinities, the theory of sets. This
theory would soon unveil a series of confusing paradoxes, leading to
a crisis in the Mathematics community regarding the stability of the
foundational principles of the math of that time.
Central to these paradoxes was the Russell’s paradox (or more generally, as
we’d talk about later, the Epimenides Paradox). Let’s see what it is.
In those simpler times, you were allowed to define a set if you could describe
it in English. And, owing to mathematicians’ predilection for self-reference,
sets could contain other sets.
Russell then, came up with this:
\(R\) is a set of all the sets which do not contain themselves.
The question was "Does \(R \) contain itself?" If it doesn’t, then according to
the second half of the definition it should. But if it does, then it no longer
meets the definition.
The same can symbolically be represented as:
Let \(R = \{ x \mid x \not \in x \} \), then \(R \in R \iff R \not \in R \)
Cue mind exploding.
“Grelling’s paradox” is a startling variant which uses adjectives instead of
sets. If adjectives are divided into two classes, autological
(self-descriptive) and heterological (non-self-descriptive), then, is
‘heterological’ heterological? Try it!
Epimenides Paradox
Or, the so-called Liar Paradox was another such paradox which shred apart
whatever concept of ‘computability’ was, at that time - the notion that things
could either be true or false.
Epimenides was a Cretan, who made one immortal statement:
“All Cretans are liars.”
If all Cretans are liars, and Epimenides was a Cretan, then he was lying when
he said that “All Cretans are liars”. But wait, if he was lying then, how can
we ‘prove’ that he wasn’t lying about lying? Ein?
This is what makes it a paradox: A statement so rudely violating the assumed
dichotomy of statements into true and false, because if you tentatively think
it’s true, it backfires on you and make you think that it is false. And a
similar backfire occurs if you assume that the statement is false. Go ahead,
try it!
If you look closely, there is one common culprit in all of these paradoxes,
namely ‘self-reference’. Let’s look at it more closely.
Strange Loopiness
If self-reference, or what Douglas Hofstadter - whose prolific work on the
subject matter has inspired this blog post - calls ‘Strange Loopiness’ was the
source of all these paradoxes, it made perfect sense to just banish
self-reference, or anything which allowed it to occur. Russell and Whitehead,
two rebel mathematicians of the time, who subscribed to this point of view, set
forward and undertook the mammoth exercise, namely “Principia Mathematica”,
which we as we will see in a little while, was utterly demolished by Gödel’s
findings.
The main thing which made it difficult to ban self-reference was that it was
hard to pin point where exactly did the self-reference occur. It may as well be
spread out over several steps, as in this ‘expanded’ version of Epimenides:
The next statement is a lie.
The previous statement is true.
Russell and Whitehead, in P.M. then, came up with a multi-hierarchy set
theory to deal with this. The basic idea was that a set of the lowest ‘type’
could only contain ‘objects’ as members (not sets). A set of the next type
could then only either contain objects, or sets of lower types. This,
implicitly banished self-reference.
Since, all sets must have a type, a set ‘which contains all sets which are not
members of themselves’ is not a set at all, and thus you can say that Russell’s
paradox was dealt with.
Similarly, if an attempt is made towards applying the expanded Epimenides to
this theory, it must fail as well, for the first sentence to make a reference
to the second one, it has to be hierarchically above it - in which case, the
second one can’t loop back to the first one.
Thirty one years after David Hilbert set before the academia to rigorously
demonstrate that the system defined in Principia Mathematica was both
consistent (contradiction-free) and complete (i.e. every true statement
could be evaluated to true within the methods provided by P.M.), Gödel
published his famous Incompleteness Theorem. By importing the Epimenides
Paradox right into the heart of P.M., he proved that not just the
axiomatic system developed by Russell and Whitehead, but none of the
axiomatic systems whatsoever were complete without being inconsistent.
Clear enough, P.M. lost it’s charm in the realm of academics.
Before Gödel’s work too, P.M. wasn’t particularly loved as well.
Why?
It isn’t just limited to this blog post, but we humans, in general, have a diet
for self-reference - and this quirky theory severely limits our ability to
abstract away details - something which we love, not only as programmers, but
as linguists too - so much so, that the preceding paragraph, “It isn’t …
this blog … we humans …” would be doubly forbidden because the ‘right’
to mention ‘this blog post’ is limited only to something which is
hierarchically above blog posts, ‘metablog-posts’. Secondly, me (presumably a
human) belonging to the class ‘we’ can’t mention ‘we’ either.
Since, we humans, love self-reference so much, let’s discuss some ways in which
it can be expressed in written form.
One way of making such a strange loop, and perhaps the ‘simplest’ is using the
word ‘this’. Here:
This sentence is made up of eight words.
This sentence refers to itself, and is therefore useless.
This blog post is so good.
This sentence conveys you the meaning of ‘this’.
This sentence is a lie. (Epimenides Paradox)
Another amusing trick for creating a self-reference without using the word
‘this sentence’ is to quote the sentence inside itself.
Someone may come up with:
The sentence ‘The sentence contains five words’ contains five words.
But, such an attempt must fail, for to quote a finite sentence inside itself
would mean that the sentence is smaller than itself. However, infinite
sentences can be self-referenced this way.
The sentence
"The sentence
"The sentence
...etc
...etc
is infinitely long"
is infinitely long"
is infinitely long"
There’s a third method as well, which you already saw in the title - the Quine
method. The term ‘Quine’ was coined by Douglas Hofstadter in his book “Gödel
Escher, Bach” (which heavily inspires this blog post). When using this, the
self-reference is ‘generated’ by describing a typographical entity, isomorphic
to the quine sentence itself. This description is carried in two parts - one is
a set of ‘instructions’ about how to ‘build’ the sentence, and the other, the
‘template’ contains information about the construction materials required.
The Quine version of Epimenides would be:
“yields falsehood when preceded by it’s quotation” yields falsehood when preceded by it’s quotation
Before going on with ‘quining’, let’s take a moment and realize how awfully
powerful our cognitive capacities are, and what goes in our head when a
cognitive payload full of self-references is delivered - in order to decipher
it, we not only need to know the language, but also need to work out the
referent of the phrase analogous to ‘this sentence’ in that language. This
parsing depends on our complex, yet totally assimilated ability to handle the
language.
The idea of referring to itself is quite mind-blowing, and we keep doing it all
the time — perhaps, why it feels so ‘easy’ for us to do so. But, we aren’t born
that way, we grow that way. This could better be realized by telling someone
much younger “This sentence is wrong.”. They’d probably be confused - What
sentence is wrong?. The reason why it’s so simple for self-reference to occur,
and hence allow paradoxes, in our language, is well, our language. It allows
our brain to do the heavy lifting of what the author is trying to get through
us, without being verbose.
Back to Quines.
Reproducing itself
Now, that we are aware of how ‘quines’ can manifest as self-reference, it would
be interesting to see how the same technique can be used by a computer program
to ‘reproduce’ itself.
To make it further interesting, we shall choose the language most apt for the
purpose - brainfuck:
Running that program above produces itself as the output. I agree, it isn’t the
most descriptive program in the world, so written in Python below, is the
nearest we can go to describe what’s happening inside those horrible chains of
+’s and >’s:
The first line generates """ on the fly, which marks multiline strings in
Python.
Next two lines define the eniuq function, which prints the argument template
twice - once, plain and then surrounded with triple quotes.
The last 4 lines cleverly call this function so that the output of the program
is the source code itself.
Since we are printing in an order opposite of quining, the name of the function
is ‘quine’ reversed -> eniuq (name stolen from Hofstadter again)
Remember the discussion about how self-reference capitalizes on the processor?
What if ‘quining’ was a built-in feature of the language, providing what we in
programmer lingo call ‘syntactic sugar’?
Let’s assume that an asterisk, * in the brainfuck interpreter would copy the
instructions before executing them, what would then be the output of the
following program?
*
It’d be an asterisk again. You could make an argument that this is silly, and
should be counted as ‘cheating’. But, it’s the same as relying on the
processor, like using “this sentence” to refer to this sentence - you rely on
your brain to do the inference for you.
What if eniuq was a builtin keyword in Python? A perfect self-rep was then
just be a call away:
eniuq('eniuq')
What if quine was a verb in the English language? We could reduce a lot of
explicit cognitive processes required for inference. The Epimenides paradox
would then be:
“yields falsehood if quined” yields falsehood if quined
Now, that we are talking about self-rep, here’s one last piece of entertainment
for you.
If you take that absurd thing above, and move around in the cartesian plane for
the coordinates \(0 \le x \le 106, k \le y \le k + 17\), where \(k\) is a
544 digit integer (just hold on with me here), color every pixel black for
True, and white otherwise, you'd get: