How to get a list of all Subversion commit author usernames?

Question


I'm looking for an efficient way to get the list of unique commit authors for an SVN repository as a whole, or for a given resource path. I haven't been able to find an SVN command specifically for this (and don't expect one), but I'm hoping there is a better way than what I've tried so far in Terminal (on OS X):

 svn log --quiet | grep "^r" | awk '{print $3}'
 svn log --quiet --xml | grep author | sed -E "s:</?author>::g"

Either of these gives me one author name per line, but both require filtering out a fair amount of extra information. They also don't handle duplicates of the same author name, so for many commits by few authors there is a ton of redundancy flowing over the wire. More often than not, I just want to see the unique author usernames. (It might occasionally be handy to get the commit count for each author, but even then it would be better if aggregated data were sent across instead.)

I generally work with client-only access, so svnadmin commands are less useful, but I could ask the repository admin for a special favor if it were strictly necessary or much more efficient. The repositories I work with have tens of thousands of commits and many active users, and I don't want to inconvenience anyone.

+57

svn commit unique username metadata

Quinn Taylor Mar 22 '10 at 19:07


5 answers

In PowerShell, set your location to the working copy and use this command.

 svn.exe log --quiet | ? { $_ -notlike '-*' } | % { ($_ -split ' \| ')[1] } | Sort -Unique

The output of svn.exe log --quiet looks like this:

 r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
 ------------------------------------------------------------------------
 r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
 ------------------------------------------------------------------------
 r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
 ------------------------------------------------------------------------
 r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
 ------------------------------------------------------------------------
 r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)

Filter out the horizontal rules with ? { $_ -notlike '-*' } .

 r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
 r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
 r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
 r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
 r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)

Split on ' \| ' to turn a record into an array.

 $ 'r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)' -split ' \| '
 r20209
 tinkywinky
 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)

The second element is the name.

Make an array from every line and select the second element with % { ($_ -split ' \| ')[1] } .

 tinkywinky
 dispy
 lala
 po
 tinkywinky

Keep only unique occurrences with Sort -Unique . This sorts the output as a side effect.

 dispy
 lala
 po
 tinkywinky
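The same four steps can be traced in Python, which may help where PowerShell is not available. This is only a sketch: the function name `unique_authors` is made up for illustration, the sample text is the log excerpt shown earlier in this answer, and the comparison is case-sensitive, whereas Sort -Unique ignores case by default.

```python
SAMPLE_LOG = """\
r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
------------------------------------------------------------------------
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20207 | lala | 2013-12-04 16:28:15 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20206 | po | 2013-12-04 14:34:32 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)
"""


def unique_authors(log_text):
    """Return sorted, de-duplicated author names from `svn log --quiet` text.

    Hypothetical helper, not part of Subversion or PowerShell.
    """
    # Step 1: drop the horizontal rules (and any blank lines).
    records = [line for line in log_text.splitlines()
               if line and not line.startswith("-")]
    # Steps 2 and 3: split each record on " | " and keep the second element.
    names = [record.split(" | ")[1] for record in records]
    # Step 4: de-duplicate, then sort (case-sensitive, unlike Sort -Unique).
    return sorted(set(names))


print(unique_authors(SAMPLE_LOG))
```

Splitting on the three-character separator " | " rather than on whitespace keeps author names that contain spaces intact.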

+33

Iain Elder Dec 05 '13 at 10:24


I needed to do this on Windows, so I used the Super Sed port for Windows ( http://www.pement.org/sed/ ) and replaced the awk and grep commands:

 svn log --quiet --xml | sed -n -e "s/<\/\?author>//g" -e "/[<>]/!p" | sort | sed "$!N; /^\(.*\)\n\1$/!P; D" > USERS.txt

This uses the Windows "sort" command, which may not be available on all machines.
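If Python is at hand, the XML output can be read with a real XML parser instead of sed patterns. The sketch below is an assumption-laden illustration rather than part of the answer above: the function name `authors_from_xml` is invented, the sample is a hand-written fragment in the shape that svn log --quiet --xml produces, and entries without an author element are simply skipped.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of `svn log --quiet --xml` output.
SAMPLE_XML = """\
<log>
<logentry revision="20209">
<author>tinkywinky</author>
<date>2013-12-05T08:56:29.000000Z</date>
</logentry>
<logentry revision="20208">
<author>dispy</author>
<date>2013-12-04T16:33:53.000000Z</date>
</logentry>
<logentry revision="20205">
<author>tinkywinky</author>
<date>2013-12-04T14:07:54.000000Z</date>
</logentry>
</log>
"""


def authors_from_xml(xml_text):
    """Return sorted unique author names from svn's XML log (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    names = set()
    for entry in root.iter("logentry"):
        # findtext returns None when the entry has no <author> child.
        author = entry.findtext("author")
        if author:
            names.add(author.strip())
    return sorted(names)


print(authors_from_xml(SAMPLE_XML))
```

Because the parser works on elements rather than lines, this does not depend on each tag sitting on its own line.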

+9

Adam Rofer Nov 17 '10 at 23:35


 svn log path-to-repo | grep '^r' | grep '|' | awk '{print $3}' | sort | uniq > committers.txt

This command has an additional grep '|' that eliminates false positives. Without it, commit-message lines that happen to start with 'r' are included, and words from commit messages are returned as if they were usernames.
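A message line that both starts with 'r' and contains '|' would still slip through the two greps. One possible tightening, sketched here in Python, is to require the full header shape "r<digits> | author | ". The pattern `HEADER`, the function name, and the commit messages in the sample are all made up for illustration, and this remains a heuristic, since a commit message could in principle imitate a header line.

```python
import re

# Header lines of a full `svn log` look like:
#   r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013) | 1 line
HEADER = re.compile(r"^r\d+ \| (.*?) \| ")

# Invented sample; the first message starts with 'r' and contains '|'.
SAMPLE_LOG = """\
------------------------------------------------------------------------
r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013) | 1 line

removed the old | separator handling
------------------------------------------------------------------------
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013) | 1 line

refactored the parser
------------------------------------------------------------------------
"""


def authors_from_full_log(log_text):
    """Return sorted unique authors, matching only revision header lines."""
    names = set()
    for line in log_text.splitlines():
        match = HEADER.match(line)
        if match:
            # Group 1 is the text between the first and second " | ".
            names.add(match.group(1))
    return sorted(names)


print(authors_from_full_log(SAMPLE_LOG))
```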

+2

crankparty Sep 20 2018-12-12T00: 00Z


A simple alternative:

 find . -name "*cpp" -exec svn log -q {} \;|grep -v "\-\-"|cut -d "|" -f 2|sort|uniq -c|sort -n

-2

user1822088 Aug 6 '14 at 18:00


Mike DeSimone · Accepted Answer · 2010-03-22 19:13

To filter out duplicates, take your output and pipe it through sort | uniq . Thus:

 svn log --quiet | grep "^r" | awk '{print $3}' | sort | uniq

I would not be surprised if this is the way to do what you ask. Unix tools often expect the user to do fancier processing and analysis with other tools.

P.S. Come to think of it, you can merge the grep and awk ...

 svn log --quiet | awk '/^r/ {print $3}' | sort | uniq

P.P.S. Per Kevin Reed ...

 svn log --quiet | awk '/^r/ {print $3}' | sort -u

P³.S. Per kan, use vertical bars instead of spaces as field separators to handle names with spaces correctly (the Python examples are updated as well) ...

 svn log --quiet | awk -F ' [|] ' '/^r/ {print $2}' | sort -u

For efficiency, you could do this as a Perl one-liner. I don't know Perl that well, so I'd wind up doing it in Python:

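 #!/usr/bin/env python
 import sys

 # Collect each author once; revision lines of `svn log --quiet` start with 'r'.
 authors = set()
 for line in sys.stdin:
     if line[0] == 'r':
         # Fields are separated by '|'; the author is the second field.
         authors.add(line.split('|')[1].strip())
 for author in sorted(authors):
     print(author)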

Or if you want a count:

 #!/usr/bin/env python
 from __future__ import print_function  # Python 2.6/2.7
 import sys

 # Map each author to the number of commits seen.
 authors = {}
 for line in sys.stdin:
     if line[0] != 'r':
         continue
     author = line.split('|')[1].strip()
     authors.setdefault(author, 0)
     authors[author] += 1
 for author in sorted(authors):
     print(author, authors[author])

Then you run:

 svn log --quiet | ./authorfilter.py
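If you would rather not pipe through the shell, the script can also run svn itself and restrict the log to one path. The following is a rough sketch under several assumptions: Python 3.7 or later, svn available on PATH, and the helper names `svn_log_quiet` and `count_authors` are invented here. The demo at the bottom uses a hand-written sample and does not call svn.

```python
import subprocess
from collections import Counter


def svn_log_quiet(target="."):
    """Return the text of `svn log --quiet TARGET` (assumes svn is on PATH)."""
    result = subprocess.run(
        ["svn", "log", "--quiet", target],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def count_authors(log_text):
    """Count commits per author in `svn log --quiet` text (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        # Revision lines start with 'r' and use " | " between fields.
        if line.startswith("r") and " | " in line:
            counts[line.split(" | ")[1]] += 1
    return counts


# Hand-written sample so the demo runs without a repository.
SAMPLE_LOG = """\
------------------------------------------------------------------------
r20209 | tinkywinky | 2013-12-05 08:56:29 +0000 (Thu, 05 Dec 2013)
------------------------------------------------------------------------
r20208 | dispy | 2013-12-04 16:33:53 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
r20205 | tinkywinky | 2013-12-04 14:07:54 +0000 (Wed, 04 Dec 2013)
------------------------------------------------------------------------
"""

for author, commits in count_authors(SAMPLE_LOG).most_common():
    print(author, commits)
```

Against a real working copy, something like count_authors(svn_log_quiet("some/path")) would give the per-author counts for that path, where "some/path" is a placeholder. The full log still travels over the wire; only the aggregation moves into one process.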
