In the context of git (and diff), what is a "hunk"?

I was looking for the definition of "hunk" while reading some git documentation.

I know it has to do with describing the differences between two files, and that it has a well-defined format, but I could not remember a concise definition.

I tried searching on Google, but there were a lot of false hits.

3 answers

And in the end I found this:

When comparing two files, diff finds sequences of lines common to both files, interspersed with groups of differing lines called hunks.

here: http://www.gnu.org/software/diffutils/manual/html_node/Hunks.html

This was exactly the short definition I was looking for. Hope this helps someone else!
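To make the definition concrete, here is a minimal sketch using Python's standard difflib module rather than GNU diff itself. The file names and contents are made up for illustration. Two edits far enough apart end up in two separate hunks, each introduced by a line starting with "@@":

```python
import difflib

# Hypothetical 20-line file; contents are invented for this example.
old = ["line %d" % i for i in range(1, 21)]
new = list(old)
new[1] = "line 2 (edited)"    # a change near the top
new[17] = "line 18 (edited)"  # a change near the bottom

# n=3 is the usual amount of context around each change.
diff = list(difflib.unified_diff(old, new, "a.txt", "b.txt", lineterm="", n=3))

# Every hunk starts with a header line beginning with "@@".
hunk_headers = [line for line in diff if line.startswith("@@")]

print("\n".join(diff))
print(len(hunk_headers), "hunks")
```

The unchanged lines between the two edits are longer than the context window, so they are left out and the diff is split into two hunks instead of one.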


For your information, you can also read this simple explanation: https://mvtechjourney.wordpress.com/2014/08/01/git-stage-hunk-and-discard-hunk-sourcetree/


The term "hunk" is not specific to Git; it comes from the GNU diffutils format. Even more succinctly:

Each hunk shows one area where the files differ.

But it is up to Git to determine the right boundaries for each hunk.
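Those boundaries are recorded in the "@@" header that opens every hunk in the unified format. As a rough sketch (the function name and the dictionary keys are my own, not part of Git or diffutils), the header can be parsed like this:

```python
import re

# "@@ -old_start[,old_count] +new_start[,new_count] @@ optional section text"
# A missing count means 1 in the unified format.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")

def parse_hunk_header(line):
    """Split a unified-diff hunk header into its numeric ranges (illustrative helper)."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError("not a hunk header: %r" % line)
    old_start, old_count, new_start, new_count, section = match.groups()
    return {
        "old_start": int(old_start),
        "old_count": int(old_count) if old_count is not None else 1,
        "new_start": int(new_start),
        "new_count": int(new_count) if new_count is not None else 1,
        "section": section,
    }

# Header taken from the git-send-email.perl diff quoted in this answer.
info = parse_hunk_header(
    "@@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {"
)
print(info)
```

Here the hunk covers 6 lines starting at line 231 of the old file and 9 lines starting at line 231 of the new file, so it adds three lines.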

The rest of this answer illustrates what a hunk looks like in Git:

After trying various heuristics (for example the compaction heuristic, which was retired in Git 2.12), the Git maintainers settled on the indent heuristic, introduced in September 2016 with Git 2.11, commit 433860f.

Some groups of added/deleted lines in diffs can be slid up or down, because the lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness, but it definitely has a big effect on aesthetics.
For example, consider the following two diffs.
The first is what standard Git emits:

    --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
    +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
    @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
     }
     
     if (!$smtp_server) {
    +    $smtp_server = $repo->config('sendemail.smtpserver');
    +}
    +if (!$smtp_server) {
         foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
             if (-x $_) {
                 $smtp_server = $_;

The following diff is equivalent, but is obviously preferable from an aesthetic point of view:

    --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
    +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
    @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
         $initial_reply_to =~ s/(^\s+|\s+$)//g;
     }
     
    +if (!$smtp_server) {
    +    $smtp_server = $repo->config('sendemail.smtpserver');
    +}
     if (!$smtp_server) {
         foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
             if (-x $_) {

This patch teaches Git to pick better positions for such "diff sliders", using heuristics that take into account the positions of nearby blank lines and the indentation of nearby lines.
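To see why the two diffs above are equivalent, here is a toy sketch in Python. It is not Git's implementation: slide_positions and the "prefer a blank line just before the group" rule are my own simplifications, and Git's real heuristic also scores indentation. The new file is given as a list of lines, and the added group is described by a half-open index range:

```python
def slide_positions(new_lines, start, end):
    """Return every (start, end) range of added lines that produces the same old file.

    A group can slide up when the line just above it equals its last line,
    and slide down when its first line equals the line just below it.
    """
    positions = [(start, end)]
    s, e = start, end
    while s > 0 and new_lines[s - 1] == new_lines[e - 1]:
        s, e = s - 1, e - 1
        positions.append((s, e))
    s, e = start, end
    while e < len(new_lines) and new_lines[s] == new_lines[e]:
        s, e = s + 1, e + 1
        positions.append((s, e))
    return sorted(positions)

# New version of the git-send-email.perl fragment from the diffs above.
new = [
    r"    $initial_reply_to =~ s/(^\s+|\s+$)//g;",
    "}",
    "",
    "if (!$smtp_server) {",
    "    $smtp_server = $repo->config('sendemail.smtpserver');",
    "}",
    "if (!$smtp_server) {",
    "    foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {",
]

# The first diff marks lines 4..6 (range 4:7) as added.
positions = slide_positions(new, 4, 7)

# Toy preference (an assumption, not Git's scoring): pick the range that
# starts right after a blank line.
best = max(positions, key=lambda p: p[0] > 0 and new[p[0] - 1].strip() == "")

print(positions)
print(best)
```

Deleting either range from the new file yields the same old file, which is why the choice is purely aesthetic. The range that starts after the blank line corresponds to the second, nicer diff.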


With Git 2.14 (Q3 2017), this indent heuristic is the default!

See commit 1fa8a66 (May 08, 2017) by Jeff King (peff).
See commit 33de716 (May 08, 2017) by Stefan Beller (stefanbeller).
See commit 37590ce, commit cf5e772 (May 08, 2017) by Marc Branchaud.
(Merged by Junio C Hamano - gitster - in commit 53083f8, June 05, 2017)

diff: enable indent heuristic by default

The feature was included in v2.11 (released 2016-11-29), and we got no negative feedback. Quite the opposite: all the feedback we received was positive.

Turn it on by default. Users who dislike the feature can turn it off by setting diff.indentHeuristic to false.
