Items tagged with: git
#gitlab #github #git #reveal #selfhosting #foss #pages #powerpoint #Präsentation
we need this.
this grabbed me enough to share it immediately.
#forgefed decentralised protocol for #git #vcs project hosting.
#projecthosting #collaboration #distributed #fediverse #freesoftware #programming #development #softwaredevelopment #versiontracking #freedom
HN Discussion: https://news.ycombinator.com/item?id=19617589
Posted by griffinmb (karma: 703)
Post stats: Points: 166 - Comments: 25 - 2019-04-09T18:25:47Z
#HackerNews #announcing #building #git
In 2017, I wrote a series of articles about diff and merge algorithms, and promised that one day they would form part of a book about the Git version control system. In 2018 I didn’t write anything at all on this blog, and that’s because I was writing that book.
Just over a month ago, Building Git was published, and so far has sold 370 copies. I’ve done a couple of interviews about it (one for The Yak Shave, and one yet to be published), and one question you get asked a lot if you write a book is: why did you write it? So in this article I want to expand a little on what the book’s about, why I wrote it, and who I think it’s for.
The quick answer to why I wrote it is, well, for the hell of it. For the same reason I wrote those diff articles, I was interested in digging in and learning as much as I could, out of sheer curiosity, and I hope as much comes across to readers. It was always supposed to be a side project alongside my day job, not something I’d be working on full time. But that’s not a very useful answer, and it’s worth reflecting on why I was interested in doing it and what motivated me to finish it.
First, there’s the surface reason: Git is notoriously confusing to new users, and they are frequently instructed to learn how it works inside in order to understand the user interface. While this is true, it’s a slightly unsatisfying answer: this is really a design problem in Git rather than a shortcoming in its users. There are already numerous excellent write-ups of how git works at various levels of abstraction: Pro Git gives a good account, Gitlet is an accessible and concise demonstration of these concepts in code, and many other third-party implementations exist. Why write another one?
That brings us to the second-level motivation I had: writing an implementation of a complex program like Git exposes you to a very broad range of computer science topics, from abstract mathematical ideas like models of concurrent editing, to the details of the Unix filesystem API. Most technical books are very narrowly focussed and designed to be easily keyword-marketed: you want to learn this year’s hot framework, you can quickly find half a dozen books on it, you pick one, work through, put it on your CV, rinse repeat. There’s nothing wrong with this per se: people need to learn things for their jobs and the tech book market does a reasonable job of serving this need.
But there’s a kind of learning that this model does not lend itself to, and that’s seeing how all this disparate stuff fits together. Most books necessarily draw on other topics in order to build useful programs, but for reasons of scope they must assume the reader already knows these incidental topics. If you don’t know them, it can be unclear where to go find out more.
If you take a degree in a subject, someone has developed a curriculum to guide you through the field and integrate the various topics you learn along the way. As a self-directed learner, it’s much harder to find that story among the mountain of targeted books available, and so I wanted to write something that was more of an extended project. Not something you approach with a checklist of things to learn, but something that takes you on a journey through a lot of topics you didn’t even know existed, and shows you how they fit together into a big picture. Something that goes through the process of building a reasonably large program rather than showing you some toy examples and leaving the rest up to your imagination. I wanted to bridge the gap between the scope of typical educational examples, and the sort of system people work on in production.
A secondary effect of the book’s broad scope is how it changes the narrative. Most book examples are of a size where you can show the reader the entire thing at once, and then explain how it works. What this leaves out is the process of getting to the end design, and this is a topic that developers tend to struggle with a great deal, especially in legacy systems and refactoring. They know the end state they want a system to arrive at, but they find it hard to make the journey there incrementally and to be comfortable with the idea of deploying it gradually, which leads to the One Big Rewrite model where you hack for six months and then do an incredibly risky roll-out.
Jit, the codebase that Building Git describes, is about 6,000 lines of Ruby code. I believe it’s impossible to describe such a codebase in a linear fashion where you show the end state of the project and attempt to explain it. A lot of it only makes sense if you go on the journey to get there, building up each piece of functionality in small increments, and refactoring when necessary. Looking only at the current state of a codebase leaves out a lot of information that you can only get from its history, and that’s why version control logs are so valuable. As well as the content in the book, Jit’s commit logs contain over 30,000 words of text, including a lot of things I couldn’t fit into the book – think of it as the extended footnotes. For example, in one commit I use some type theory to derive an abstraction that unifies two apparently different structures, so that a single class can work with both of them. I find it hard to explain why such abstractions exist without going through the process that led to them, and I think there’s a gap in technical literature to explore this outside of books specifically dedicated to the process of changing code.
Finally, there’s the big picture stuff. Why did I choose Ruby? Well you have to choose something, and I know Ruby, and that’s about all the justification I can offer. It’s not that I didn’t evaluate other languages, but for me Ruby led to the least amount of incidental complexity in the early material in terms of installation, project structure, build tooling and so on, and it has a rich enough standard library that you don’t need any third party code to do this project. I wanted the book to be able to cross language barriers and not be a Ruby book, and so I’m really glad people are following it in other languages. So far I know of people doing it in C++, Clojure, Elixir, Go, Haskell, Java, Node.js, Rust, and Swift. Having said that, I do believe Ruby lowered the language barrier more than I personally could have done in other languages, and doing this project reminded me of why I like it so much. That doesn’t mean it’s The Best Language, it was just the best language for me, for this project, for now.
But the choice of Ruby is also significant for cultural reasons. I’ve spent most of my career so far in web development, and Ruby is primarily known for its role in that space. It’s dismissed as a language you cannot write programs like Git in – it’s too slow, it’s too “high level” – and web developers are dismissed as not being “real programmers” by other parts of the tech ecosystem. What is a “real programmer”? I don’t know, I just know that people who work in my sector are often told they aren’t one. Real programmers know C. Real programmers work in systems programming. Real programmers do open source. Real programmers can tell you every member of every good data structure from 1962 to 1978. But they don’t write web apps, oh no.
There’s a huge section of the tech ecosystem that’s constantly told they’re not smart enough to be here and that their work doesn’t matter. I spent a decade hearing C was beyond mere mortals, that you must be a genius to go anywhere near low-level code, or algorithms, or distributed systems. The inventor of Git is notorious for pushing this narrative! But the truth is, anyone with enough brains and patience to learn how to do any kind of computing is “smart enough” to learn things like this. The thing that makes any kind of programming hard is gigantic functions that do seven different tangentially related things and hide important concepts so that their file formats end up with half a dozen different ways to encode an integer. It’s programs like that where you can’t actually see the system design in the code; if your codebase looks like that it’s going to be difficult in any language.
So, I wrote this book for the not-real programmers, the people told they’re not hardcore enough, that programs like Git are written by brain geniuses and that mere mortals cannot understand them. I was inspired by Gary Bernhardt’s From Scratch videos, and by Julia Evans’s zines, that demystify everyday software tools. I want you to feel like there’s nothing about computers you can’t ultimately figure out, and it has been so rewarding to see people publishing their first commits with their own Git clones, full of excitement at the world they’ve just created.
You can buy Building Git via my shop, and if you enjoy it I’d really appreciate a review on Goodreads.
HN Discussion: https://news.ycombinator.com/item?id=19570289
Posted by davvid (karma: 1535)
Post stats: Points: 107 - Comments: 30 - 2019-04-04T08:41:55Z
#HackerNews #format-patch #git #problems #with
git, implemented in rust, for fun and education 🦀: - chrisdickinson/git-rs
Article word count: 34
HN Discussion: https://news.ycombinator.com/item?id=19540845
Posted by adamnemecek (karma: 49429)
Post stats: Points: 151 - Comments: 42 - 2019-04-01T05:49:14Z
#HackerNews #git #implemented #rust
Let's assume that you have a commit C containing a hunk H, and that you want to remove H from C and make it part of your uncommitted changes instead.
To do that, we create a reversed patch R of H and commit it as a fixup! to C.
- Get into the magit-status screen
- Put your working changes aside
z(both) and type "restore me soon" and hit
- Drill down to find the H inside C:
- select the commit C to edit and hit
- select the file or hunk to remove and --
- Create R - an unstaged reverse of H
- Stage R
- select R and hit
- select R and hit
- Create unstaged H
- select R and hit
- select R and hit
- fixup! C with R
f(fixup) the staged change
- select commit C
C-c C-c(magit-select-pick) to commit the reverse patch
- Rebase the fixup!
- select commit C
- Bring back the changes you set aside in (1)
p(pop) select "restore me soon" and hit
Article word count: 749
HN Discussion: https://news.ycombinator.com/item?id=19386141
Posted by adamnemecek (karma: 48962)
Post stats: Points: 136 - Comments: 4 - 2019-03-14T02:50:45Z
#HackerNews #git #write #yourself
Now that we have repositories, putting things inside them is in order. Also, repositories are boring, and writing a Git implementation shouldn’t be just a matter of writing a bunch of mkdir. Let’s talk about objects, and let’s implement git hash-object and git cat-file.
Maybe you don’t know these two commands — they’re not exactly part of an everyday git toolbox, and they’re actually quite low-level (“plumbing”, in git parlance). What they do is actually very simple: hash-object converts an existing file into a git object, and cat-file prints an existing git object to the standard output.
Now, what actually is a Git object? At its core, Git is a “content-addressed filesystem”. That means that unlike regular filesystems, where the name of a file is arbitrary and unrelated to that file’s contents, the names of files as stored by Git are mathematically derived from their contents. This has a very important implication: if a single byte of, say, a text file, changes, its internal name will change, too. To put it simply: you don’t modify a file, you create a new file in a different location. Objects are just that: files in the git repository, whose path is determined by their contents.
Git is not (really) a key-value store
Some documentation, including the excellent Pro Git, calls Git a “key-value store”. This is not incorrect, but may be misleading. Regular filesystems are actually closer to a key-value store than Git is. Because it computes keys from data, Git should rather be called a value-value store.
Git uses objects to store quite a lot of things: first and foremost, the actual files it keeps in version control (source code, for example). Commits are objects, too, as well as tags. With a few notable exceptions (which we’ll see later!), almost everything in Git is stored as an object.
An object’s path is computed from the SHA-1 hash of its contents. More precisely, Git renders the hash as a lowercase hexadecimal string, and splits it in two parts: the first two characters, and the rest. It uses the first part as a directory name and the rest as the file name. (This is because most filesystems slow down to a crawl when a single directory holds too many files. Git’s method creates 256 possible intermediate directories, dividing the average number of files per directory by 256.)
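To make the two-part split concrete, here is a minimal sketch in Python. The function name `object_path` and the `objects/` prefix are assumptions made for illustration, and the function simply hashes whatever bytes it is given, leaving the question of exactly which bytes get hashed to the caller.

```python
import hashlib


def object_path(data: bytes) -> str:
    """Derive a storage path from the SHA-1 of `data` (hypothetical helper)."""
    # hexdigest() returns the hash as 40 lowercase hexadecimal characters.
    sha = hashlib.sha1(data).hexdigest()
    # First two characters name the directory, the remaining 38 name the file.
    return f"objects/{sha[:2]}/{sha[2:]}"


print(object_path(b"some file contents"))
```

Two hex characters give 16 * 16 = 256 possible directory names, which is where the 256 in the paragraph above comes from.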
What is a hash function?
Simply put, a hash function is a kind of unidirectional mathematical function: it is easy to compute the hash of a value, but there’s no way to compute which value produced a hash. A very simple example of a hash function is the strlen function. It’s really easy to compute the length of a string, and the length of a given string will never change (unless the string itself changes, of course!) but it’s impossible to retrieve the original string, given only its length. Cryptographic hash functions are just a much more complex version of the same, with the added property that computing an input meant to produce a given hash is hard enough to be practically impossible. (With strlen, producing an input i with strlen(i) == 12 is easy: you just have to type twelve random characters. With algorithms such as SHA-1, it would take much, much longer, long enough to be practically impossible.)
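The contrast between strlen and a cryptographic hash can be seen in a few lines of Python. This is only an illustrative sketch: `strlen_hash` is a made-up name standing in for C's strlen, and the two sample strings are arbitrary.

```python
import hashlib


def strlen_hash(s: str) -> int:
    """A deliberately weak 'hash': just the length of the string."""
    return len(s)


a = "hello, world"
b = "HELLO, WORLD"

# Both strings are twelve characters long, so the weak hash cannot tell them apart.
print(strlen_hash(a) == strlen_hash(b))  # True

# SHA-1 gives the two inputs different digests.
sha_a = hashlib.sha1(a.encode("utf-8")).hexdigest()
sha_b = hashlib.sha1(b.encode("utf-8")).hexdigest()
print(sha_a == sha_b)  # False
```

Finding a second input with the same strlen took no effort at all; doing the same for a chosen SHA-1 digest is the part that is meant to be impractical.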
Before we start implementing the object storage system, we must understand its exact storage format. An object starts with a header that specifies its type: blob, commit, tag or tree. This header is followed by an ASCII space (0x20), then the size of the object in bytes as an ASCII number, then a null byte (0x00), then the contents of the object. The first 48 bytes of a commit object in Wyag’s repo look like this:
00000000 63 6f 6d 6d 69 74 20 31 30 38 36 00 74 72 65 65 |commit 1086.tree|
00000010 20 32 39 66 66 31 36 63 39 63 31 34 65 32 36 35 | 29ff16c9c14e265|
00000020 32 62 32 32 66 38 62 37 38 62 62 30 38 61 35 61 |2b22f8b78bb08a5a|
In the first line, we see the type header, a space (0x20), the size in ASCII (1086) and the null separator 0x00. The last four bytes on the first line are the beginning of that object’s contents, the word “tree”. We’ll discuss that further when we talk about commits.
The objects (headers and contents) are stored compressed with zlib.
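Putting the format together, here is a rough sketch of what writing and reading an object could look like in Python. The names `encode_object` and `decode_object` are hypothetical, not taken from the article's code, and nothing here touches the disk. The hash is computed over the header plus the contents, which matches what `git hash-object` reports.

```python
import hashlib
import zlib


def encode_object(obj_type: str, contents: bytes) -> tuple[str, bytes]:
    """Return (sha, compressed_bytes) for an object of the given type."""
    # Header: type, ASCII space, size as an ASCII number, null byte.
    header = obj_type.encode("ascii") + b" " + str(len(contents)).encode("ascii") + b"\x00"
    raw = header + contents
    sha = hashlib.sha1(raw).hexdigest()
    return sha, zlib.compress(raw)


def decode_object(compressed: bytes) -> tuple[str, bytes]:
    """Inverse of encode_object: return (type, contents)."""
    raw = zlib.decompress(compressed)
    space = raw.find(b" ")          # end of the type name
    null = raw.find(b"\x00", space)  # end of the size field
    obj_type = raw[:space].decode("ascii")
    size = int(raw[space + 1:null].decode("ascii"))
    contents = raw[null + 1:]
    if size != len(contents):
        raise ValueError(f"malformed object: header says {size} bytes, found {len(contents)}")
    return obj_type, contents


sha, blob = encode_object("blob", b"hello\n")
print(sha)
print(decode_object(blob))
```

Checking the declared size against the actual length of the contents is a cheap way to catch a truncated or corrupted object before anything else tries to parse it.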
I’m still on GitHub, but I’m not planning to put more code on it now that it has become part of Microsoft. It’s not purely an “extremist” position, but a company that had a CEO who disrespected free software from the 1970s onward doesn’t deserve to own platforms like this.
#gitlab #git #ci #testing #automation #programming
I guess the real answer is to not use those command line programs and instead use the emacs alternatives like magit and so on? #emacs #git
It’s Magit! And you’re the magician! · Endless Parentheses
There’s nothing I can praise about Magit that hasn’t been written in a dozen blogs already, but since Jonas started a kickstarter campaign for it I knew I had to say something. If you use Magit, you already know the greatness of it. And if you don’t, hopefully I can convince you to try it in time to back the campaign.