
map -- making xargs simpler and more powerful at the same time!


The map command was something I wrote a long time ago and have used pretty much forever, in a sort of "taken for granted" way. I would never have put it out there if I had not, by chance, discovered something called GNU Parallel and started reading the huge list of examples on its pages.

And yet, casually looking at the examples, I found that map could do pretty much all of the generic ones! So much so that I sat down and started writing map equivalents of GNU Parallel's examples, and before I knew it I was about halfway through their list, with only a few that map could not do! The end result was this feature comparison.

But...

(In all fairness, here's a list of things map can't/won't do which GNU Parallel can/will, although a lot of them are "kitchen sink" items!)

And that was when I decided to put this out there as its own little project.

If you use it, please let me know. Some quick documentation is right here in this file. Examples are here. map responds to -h as you would expect, if you need to refresh your memory.


concepts

'map' is like xargs in many ways, except for having very few options, and a fixed set of "replace strings", all using the % character.

Here are the highlights:

default replacement string

If no replacement string (%, %%, or variants) exists anywhere in the command, the default is to assume a '%%' at the end.

However, if the '-p' option is used without the '-n' option, the default becomes '%'.
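The two rules above can be written out as a tiny decision function. This is only a sketch of the rule as described here, not map's actual code; the function name default_placeholder and its yes/no arguments are made up for illustration.

```shell
# Hypothetical helper (not part of map): which replace string is assumed
# when the command contains none?
#   $1 = "yes" if -p was given, $2 = "yes" if -n was given
default_placeholder() {
    if [ "$1" = yes ] && [ "$2" != yes ]; then
        echo '%'     # parallel mode without -n: one input line per run
    else
        echo '%%'    # everything else: as many input lines as will fit
    fi
}

default_placeholder no  no     # plain map
default_placeholder yes no     # map -p
default_placeholder yes yes    # map -p -n
default_placeholder no  yes    # map -n
```

Only the "-p without -n" case yields '%'; the other three cases yield '%%'.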

details

single replacements

% is replaced by the current input line, with a trailing slash removed if present. %D is replaced by the directory name of the current filename. %B is the basename and %E is the extension. (This means that % is pretty much equal to %D/%B.%E).
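To make the pieces concrete, here is roughly what the four single replacements correspond to in plain shell. This is an approximation using dirname, basename and parameter expansion, not map's own implementation; in particular, how map treats names with no extension or with several dots is not covered here. The sample path is made up.

```shell
# Approximate shell equivalents of %, %D, %B and %E (illustration only).
f="docs/report.pdf/"           # made-up input line with a trailing slash

whole=${f%/}                   # %  : input line, one trailing slash removed
dir=$(dirname "$whole")        # %D : directory name
file=$(basename "$whole")      # last path component, "report.pdf"
base=${file%.*}                # %B : basename without the extension
ext=${file##*.}                # %E : extension

echo "$whole"
echo "$dir/$base.$ext"         # same string again, rebuilt from the parts
```

Both lines print docs/report.pdf, which is the "% is pretty much %D/%B.%E" relationship.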

As said above, these replacements use only one input line per run, so

seq 1 3 | map echo %

gives you

1
2
3

multiple replacements

Most often, you want all the arguments tacked on to one "run" of the command. Do this by specifying a %%:

seq 1 3 | map echo %%

returns

1 2 3

Since this is the most common reason for using map, this is the default if you don't specify either % or %%:

seq 1 3 | map echo
# returns:
1 2 3
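For comparison, this default is the same thing plain xargs does when given no options, so the example above can be checked on a machine without map installed. The variable name out is just for the demonstration.

```shell
# xargs with no options also appends all input lines to a single run.
out=$(seq 1 3 | xargs echo)
echo "$out"    # prints: 1 2 3
```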

A %% (and similarly %%D, %%B, and %%E) gets replaced by as many input lines as possible (subject to an internal limit on command line length and the user-specified -n value, if used).

Just like GNU Parallel, this replacement even works within a word, replicating the entire word:

seq 1 3 | map echo abc-%%-def

produces

abc-1-def abc-2-def abc-3-def
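If you want to see what this word replication amounts to, the same output can be produced with sed and xargs. This is an emulation with standard tools, shown only to make the behaviour concrete; it says nothing about how map does it internally.

```shell
# Wrap each input line in the surrounding word, then hand all of them to one echo.
out=$(seq 1 3 | sed 's/.*/abc-&-def/' | xargs echo)
echo "$out"    # prints: abc-1-def abc-2-def abc-3-def
```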

multiple jobs in parallel

When you run something like:

map -p 4 gzip *.pdf

you are running 4 jobs in parallel. Asking for parallel jobs usually (though not always) means the work is CPU bound, so it's best to run each job on one input line rather than give it as many as it will take.

So when you run in parallel mode, the default is % because that is what makes sense.
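As a rough analogy in plain xargs, "one input line per job, 4 jobs at a time" looks like the sketch below. The -P option is a GNU and BSD extension rather than POSIX, and the "job N" output is made up for the demonstration; this is not meant to show the exact command line map builds.

```shell
# One argument per invocation (-n 1), up to 4 invocations at once (-P 4).
# Output order is not deterministic in parallel mode, hence the sort.
out=$(seq 1 4 | xargs -n 1 -P 4 sh -c 'echo "job $1"' _ | sort)
echo "$out"
```

Each of the four jobs sees exactly one input line, which is the behaviour the '%' default gives you.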

specifying maximum arguments per invocation

However, if you use -n (even if you are also using -p), the default switches back to %%. The logic is that specifying "maximum arguments per invocation" implicitly gives permission to actually have more than one argument, overriding the -p exception.

So yeah this is an exception to an exception but I don't think it's too hard to remember.

And if in doubt, you can always specify what you want, you know...
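For a feel of what "maximum arguments per invocation" does to the grouping, here is the xargs option of the same name. It is shown as an analogy and assumes map's -n chunks its input the same way; check map -h for the exact syntax.

```shell
# At most 2 arguments per run: five input lines become three runs of echo.
out=$(seq 1 5 | xargs -n 2 echo)
echo "$out"
# prints:
# 1 2
# 3 4
# 5
```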

delimiter mode

Here's an example (more documentation may follow if anyone asks). Notice the delimiter character (colon) and the use of field 1 and field 7:

cat /etc/passwd | egrep -v 'nologin|bash' | map -d=: echo %1 use %7 as shell

The default delimiter is whitespace. For convenience, '-d=t' uses tabs. Anything else, like ':', is specified literally, like above.

Here's another example: report users who have some shell as login but no GECOS field:

< /etc/passwd map -d=: -- '[[ %7 =~ sh ]] && [ -z "%5" ] && echo %1 || :'
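If the numbered fields look unfamiliar, the first delimiter example above does about the same job as this awk one-liner, where %1 and %7 play the role of $1 and $7. The sample lines are made up so that the result does not depend on your /etc/passwd, and grep -E stands in for egrep.

```shell
# Made-up passwd-style input for illustration.
sample='alice:x:1000:1000:Alice:/home/alice:/bin/zsh
daemon:x:1:1::/usr/sbin:/usr/sbin/nologin
carol:x:1001:1001::/home/carol:/bin/sh'

# Split on ':' and print field 1 and field 7 around some fixed words.
out=$(printf '%s\n' "$sample" | grep -Ev 'nologin|bash' |
      awk -F: '{ print $1, "use", $7, "as shell" }')
echo "$out"
# prints:
# alice use /bin/zsh as shell
# carol use /bin/sh as shell
```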

IMPORTANT NOTES

Filenames with unusual characters

map works fine with such filenames, except that you need some extra quoting in parallel mode (since that invokes xargs), or if you want to use redirection or multiple commands.

For example (assuming some command sending in a list of filenames), this will fail:

... | map 'echo -n `gzip <  %  | wc -c`; echo -n '*100/'; wc -c <  % ' | bc

but this will succeed:

... | map 'echo -n `gzip < "%" | wc -c`; echo -n '*100/'; wc -c < "%"' | bc

(by the way, this computes the size of the gzipped file as a percentage of the original)
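The reason the quotes matter can be shown without map at all. The sketch below assumes, as the examples above suggest, that the filename is substituted as text into a command string that a shell then parses; the temporary directory and the file name "my file.txt" are made up for the demonstration.

```shell
# Why a filename pasted into a command string needs quotes (illustration only).
dir=$(mktemp -d)
f="$dir/my file.txt"
printf 'hello\n' > "$f"

# Unquoted: the shell sees ".../my" as the redirection target and "file.txt"
# as an argument, so the redirection fails.
if sh -c "wc -c < $f" 2>/dev/null; then unquoted=ok; else unquoted=failed; fi

# Quoted: the whole name stays one word, so the redirection works.
quoted=$(sh -c "wc -c < \"$f\"" | tr -d ' ')

echo "unquoted: $unquoted"
echo "quoted: $quoted bytes"
rm -rf "$dir"
```

The unquoted form fails on any name containing whitespace, while the quoted form reports the 6 bytes of the test file.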

OTHER WARNINGS