User manual¶
dataparsers is a simple module that wraps argparse to build command line argument parsers from dataclasses. The
resulting parsers are type checkable: the parsed arguments are recognized by type checkers and can be used by
autocomplete tools.
Basic usage¶
Create a dataclass describing your command line interface, and call parse() with the class:
# prog.py
from dataclasses import dataclass
from dataparsers import parse
@dataclass
class Args:
    foo: str
    bar: int = 42
args = parse(Args)
print("Printing `args`:")
print(args)
The dataclass fields that have a default value are turned into optional arguments, while the fields without a default
become positional arguments.
The script can then be used in the same way as any argparse script:
$ python prog.py -h
usage: prog.py [-h] [--bar BAR] foo
positional arguments:
foo
options:
-h, --help show this help message and exit
--bar BAR
The resulting args object is an instance of Args (recognized by type checkers and autocomplete tools):
$ python prog.py test --bar 12
Printing `args`:
Args(foo='test', bar=12)
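The mapping described above (fields without a default become positionals, fields with a default become `--` options) can be sketched with plain argparse. This is only an illustration of the idea, not the implementation used by dataparsers; the helper name `build_parser` is hypothetical.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class Args:
    foo: str
    bar: int = 42


def build_parser(cls) -> argparse.ArgumentParser:
    """Hypothetical helper: map dataclass fields to argparse arguments."""
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        if f.default is MISSING:
            # No default value: positional argument
            parser.add_argument(f.name, type=f.type)
        else:
            # Default value: optional argument with a `--` flag
            parser.add_argument(f"--{f.name}", type=f.type, default=f.default)
    return parser


# Parse into a Namespace, then populate the dataclass from it
namespace = build_parser(Args).parse_args(["test", "--bar", "12"])
args = Args(**vars(namespace))
print(args)
```

The sketch ignores `default_factory`, booleans and string annotations, which a real implementation would have to handle.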
Interactive parse, modify parsers and partial parsing¶
It is possible to pass arguments in code, in the same way as the original parse_args() method:
>>> parse(Args, ["newtest", "--bar", "32"])
Args(foo='newtest', bar=32)
To create an argument parser without immediately parsing the arguments (i.e., to save it for later), use the
make_parser() function:
>>> parser = make_parser(Args)
It is also possible to parse only some of the command-line arguments and pass the remaining ones on to another
script or program, using the parse_known() function. It works much like parse(), except that it does not
produce an error when extra arguments are present. Instead, it returns a two-item tuple containing the populated class
and the list of remaining argument strings:
>>> @dataclass
... class Args:
...     foo: bool
...     bar: str
...
>>> parse_known(Args, ['--foo', '--badger', 'BAR', 'spam'])
(Args(foo=True, bar='BAR'), ['--badger', 'spam'])
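For comparison, the same partial parsing can be written directly against argparse with `parse_known_args()`, which returns a `Namespace` instead of a populated dataclass. This is a sketch of the underlying argparse behavior, not of how dataparsers is implemented.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--foo", action="store_true")
parser.add_argument("bar")

# Unknown arguments are collected instead of raising an error
namespace, remaining = parser.parse_known_args(["--foo", "--badger", "BAR", "spam"])
print(namespace, remaining)
```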
All three functions, parse(), make_parser() and parse_known(), accept a parser=... keyword argument to modify an
existing parser:
>>> from argparse import ArgumentParser
>>> prev_parser = ArgumentParser(description="Existing parser")
>>> parse(Args, ["-h"], parser=prev_parser)
usage: [-h] [--bar BAR] foo
Existing parser
positional arguments:
foo
options:
-h, --help show this help message and exit
--bar BAR
Argument specification¶
To specify detailed information about each argument, assign the result of the arg() function to the dataclass fields:
# prog.py
from dataclasses import dataclass
from dataparsers import parse, arg
@dataclass
class Args:
    foo: str = arg(help="foo help")
    bar: int = arg(default=42, help="bar help")
args = parse(Args)
This allows the interface to be customized:
$ python prog.py -h
usage: prog.py [-h] [--bar BAR] foo
positional arguments:
foo foo help
options:
-h, --help show this help message and exit
--bar BAR bar help
In general, the arg() function accepts all parameters of the original add_argument() method (with a few
exceptions), plus some additional parameters. The default keyword argument used above makes the argument optional
(i.e., passed with a flag like --bar), except in some specific situations.
One parameter of add_argument() that cannot be passed to arg() is the dest keyword argument, because the name of
the class attribute is already determined by the dataclass field name.
The type parameter is one of the add_argument() parameters that is inferred from the dataclass field when not given
explicitly.
Aliases¶
The first parameter of the original add_argument() method is name_or_flags, which is either a series of flags or a
simple argument name. This parameter can be passed to the arg() function to define aliases for optional arguments:
@dataclass
class Args:
    foo: str = arg(help="foo help")
    bar: int = arg("-b", default=42, help="bar help")
args = parse(Args)
In this case, a -- flag is also created automatically from the field name:
$ python prog.py -h
usage: prog.py [-h] [-b BAR] foo
positional arguments:
foo foo help
options:
-h, --help show this help message and exit
-b BAR, --bar BAR bar help
However, the name_or_flags parameter accepts only flags (i.e., strings starting with - or --). Passing a simple
non-flag name would make no sense: in argparse that name determines the attribute name, which here is already defined
by the dataclass field name.
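The interface shown in this section corresponds to the following plain argparse calls, given here as a sketch of the equivalent parser rather than as the code dataparsers runs internally. Both spellings of the option fill the same attribute.

```python
import argparse

parser = argparse.ArgumentParser(prog="prog.py")
parser.add_argument("foo", help="foo help")
# The short alias and the long flag share the destination "bar"
parser.add_argument("-b", "--bar", type=int, default=42, help="bar help")

short = parser.parse_args(["test", "-b", "7"])
long = parser.parse_args(["test", "--bar", "7"])
print(short.bar, long.bar)
```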
Automatic flag creation¶
One situation where the default keyword argument does not automatically make the argument optional (i.e., create a
-- flag) is when the nargs parameter is set to ? or *. That is because, in the original add_argument() method, this
setting also allows positional arguments to use a default value. So, the flags must be passed explicitly to make the
argument optional:
@dataclass
class Args:
    bar: int = arg("--bar", default=42, nargs="?", help="bar help")
An alternative way to force the creation of the -- flag from the field name is by passing the additional keyword
argument make_flag=True:
@dataclass
class Args:
    bar: int = arg(default=42, nargs="?", help="bar help", make_flag=True)
Both formats above produce the same interface:
$ python prog.py -h
usage: prog.py [-h] [--bar [BAR]]
options:
-h, --help show this help message and exit
--bar [BAR] bar help
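The reason a default alone is ambiguous here can be seen in plain argparse, where `nargs="?"` lets both a positional and an optional argument fall back to a default. The sketch below only illustrates that argparse behavior; the parser names are made up for the example.

```python
import argparse

# Positional argument: with nargs="?" the default applies when it is omitted
positional = argparse.ArgumentParser()
positional.add_argument("bar", type=int, default=42, nargs="?")

# Optional argument: the explicit "--bar" flag turns it into an option
optional = argparse.ArgumentParser()
optional.add_argument("--bar", type=int, default=42, nargs="?")

print(positional.parse_args([]).bar)
print(positional.parse_args(["7"]).bar)
print(optional.parse_args([]).bar)
print(optional.parse_args(["--bar", "7"]).bar)
```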
Avoiding automatic flag creation¶
When only single - flags are passed to the arg() function, a -- flag is also created automatically from the
dataclass field name (as shown in the example of the “Aliases” section). To prevent that, pass
make_flag=False:
@dataclass
class Args:
    bar: int = arg("-b", default=42, help="bar help", make_flag=False)
args = parse(Args)
Then, only the single - flags will appear in the interface:
$ python prog.py -h
usage: prog.py [-h] [-b BAR]
options:
-h, --help show this help message and exit
-b BAR bar help
Booleans¶
Boolean attributes are always treated as flag arguments, using the "store_true" or "store_false" values for the
action parameter of the original add_argument() method. If a boolean field has no default value, the flag is still
created automatically and the default value of the parameter is set to False (this default can be changed with the
keyword argument default_bool of the dataparser() decorator; see “Default for booleans”):
>>> @dataclass
... class Args:
...     bar: bool
...
>>> make_parser(Args).print_help()
usage: [-h] [--bar]
options:
-h, --help show this help message and exit
--bar
>>> parse(Args, [])
Args(bar=False)
Decoupling code from the command line interface¶
The automatic flag creation does not happen when -- flags are already passed (unless it is forced by passing
make_flag=True):
@dataclass
class Args:
    path: str = arg("-f", "--file-output", metavar="<filepath>", help="Text file to write output")
args = parse(Args)
print(args)
This may be the most common case when the intention is to decouple the command line interface from the class attribute names:
$ python prog.py -h
usage: prog.py [-h] [-f <filepath>]
options:
-h, --help show this help message and exit
-f <filepath>, --file-output <filepath>
Text file to write output
In this situation, the interface can be customized, and the flags are not related to the attribute names inside the code:
$ python prog.py --file-output myfile.txt
Args(path='myfile.txt')
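In plain argparse, the same decoupling is obtained with the dest keyword argument, which dataparsers replaces with the dataclass field name. A minimal sketch of the equivalent argparse parser, for comparison only:

```python
import argparse

parser = argparse.ArgumentParser(prog="prog.py")
# dest plays the role of the dataclass field name `path`
parser.add_argument(
    "-f",
    "--file-output",
    dest="path",
    metavar="<filepath>",
    help="Text file to write output",
)

namespace = parser.parse_args(["--file-output", "myfile.txt"])
print(namespace.path)
```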
Argument groups¶
Two important additional keyword arguments can be passed to the arg() function to specify “argument groups”:
group_title and mutually_exclusive_group_id.
Note
In v2.1, the introduction of 2 new keyword arguments for the arg() function (group and
mutually_exclusive_group) made it easier to specify groups and mutually exclusive groups at the class scope. See
“Argument groups using ClassVar”.
Conceptual grouping¶
The group_title parameter defines the title (or the ID) of the argument group in which the argument is included. The
titled group is created later with the add_argument_group() method, which simply separates the arguments into more
appropriate conceptual groups:
>>> @dataclass
... class Args:
...     foo: str = arg(group_title="Group1")
...     bar: str = arg(group_title="Group1")
...     sam: str = arg(group_title="Group2")
...     ham: str = arg(group_title="Group2")
...
>>> parser = make_parser(Args)
>>> parser.print_help()
usage: [-h] foo bar sam ham
options:
-h, --help show this help message and exit
Group1:
foo
bar
Group2:
sam
ham
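The help layout above matches what argparse produces when arguments are added to titled groups created with `add_argument_group()`. The following sketch builds such a parser by hand; it illustrates the argparse mechanism only and makes no claim about dataparsers internals.

```python
import argparse

parser = argparse.ArgumentParser(prog="prog.py")

# Each titled group gets its own section in the help output
group1 = parser.add_argument_group("Group1")
group1.add_argument("foo")
group1.add_argument("bar")

group2 = parser.add_argument_group("Group2")
group2.add_argument("sam")
group2.add_argument("ham")

help_text = parser.format_help()
print(help_text)
```

Grouping only affects the help output: the arguments are still parsed in the order in which they were added.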
Argument groups may have a description in addition to the title. To define the description of an argument group, see
the dataparser() decorator, which allows options for the ArgumentParser object to be defined.
Mutual exclusion¶
The mutually_exclusive_group_id defines the name (or the ID) of the mutually exclusive argument group in which the
argument may be included. The identified group will be created later, by the method add_mutually_exclusive_group(),
which is used in argparse to create mutually exclusive arguments:
>>> @dataclass
... class Args:
...     foo: str = arg(mutually_exclusive_group_id="my_group")
...     bar: str = arg(mutually_exclusive_group_id="my_group")
...
>>> parser = make_parser(Args)
>>> parser.print_help()
usage: [-h] [--foo FOO | --bar BAR]
options:
-h, --help show this help message and exit
--foo FOO
--bar BAR
With that, argparse will make sure that only one of the arguments in the mutually exclusive group was present on the
command line:
>>> parse(Args,['--foo','test','--bar','newtest'])
usage: [-h] [--foo FOO | --bar BAR]
: error: argument --bar: not allowed with argument --foo
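The error above comes from argparse itself. A sketch of the equivalent parser built with `add_mutually_exclusive_group()` shows that a conflicting command line makes argparse print the error and exit; here the resulting `SystemExit` is caught only so that the exit status can be inspected.

```python
import argparse

parser = argparse.ArgumentParser()
exclusive = parser.add_mutually_exclusive_group()
exclusive.add_argument("--foo")
exclusive.add_argument("--bar")

# A single argument of the group is accepted
ok = parser.parse_args(["--foo", "test"])

# Two arguments of the same group are rejected by argparse
try:
    parser.parse_args(["--foo", "test", "--bar", "newtest"])
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code

print(ok, exit_code)
```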
Note
Mutually exclusive arguments are always optional. If no flag is given, flags are created automatically from the
dataclass field names, regardless of the value of make_flag.
Mutually exclusive groups also accept a required argument, to indicate that at least one of the mutually exclusive
arguments is required. To define the required status of a mutually exclusive argument group, see the dataparser()
decorator.
Identifying argument groups¶
Both parameters, group_title and mutually_exclusive_group_id, may be integers. This makes it easier to prevent typos
when identifying the groups. If an integer is given for group_title, it is used to identify the group, but the value
is not passed as the title to the original add_argument_group() method (None is passed instead). This prevents the
integer from being printed in the displayed help message:
>>> @dataclass
... class Args:
...     foo: str = arg(group_title=1)
...     bar: str = arg(group_title=1)
...     sam: str = arg(group_title=2)
...     ham: str = arg(group_title=2)
...
>>>
>>> parser = make_parser(Args)
>>> parser.print_help()
usage: [-h] foo bar sam ham
options:
-h, --help show this help message and exit
foo
bar
sam
ham
Note
Mutually exclusive argument groups do not support the title and description arguments of the
add_argument_group() method. However, a mutually exclusive group can be added to an argument group that has a
title and description. This is achieved by passing both group_title and mutually_exclusive_group_id
parameters to the arg() function. If there is a conflict (i.e., same mutually exclusive group and different group
titles), the mutually exclusive group takes precedence.
Argument groups using ClassVar (v2.1+)¶
Two additional keyword arguments were introduced in v2.1, with functionality analogous to the previous parameters.
The group and mutually_exclusive_group keyword arguments also accept a predefined ClassVar, which can be
initialized using two new functions, group() and mutually_exclusive_group():
from typing import ClassVar
from dataclasses import dataclass
from dataparsers import arg, group
@dataclass
class Args:
    my_first_group: ClassVar = group()
    foo: str = arg(group=my_first_group)
    bar: str = arg(group=my_first_group)
    my_second_group: ClassVar = group()
    sam: str = arg(group=my_second_group)
    ham: str = arg(group=my_second_group)
Using ClassVar names makes it even easier to prevent typos when identifying groups inside the class. Moreover, the
group() function accepts the keyword arguments title and description, and the mutually_exclusive_group() function
accepts required, which helps to describe the groups without the need for the dataparser() decorator:
>>> @dataclass
... class Args:
...     my_first_group: ClassVar = group(title="Group1", description="First group description")
...     my_1st_exclusive_group: ClassVar = mutually_exclusive_group(required=False)
...     foo: str = arg(group=my_first_group, mutually_exclusive_group=my_1st_exclusive_group)
...     bar: str = arg(group=my_first_group, mutually_exclusive_group=my_1st_exclusive_group)
...     ...
...     my_second_group: ClassVar = group(title="Group2", description="Second group description")
...     my_2nd_exclusive_group: ClassVar = mutually_exclusive_group(required=True)
...     sam: str = arg(group=my_second_group, mutually_exclusive_group=my_2nd_exclusive_group)
...     ham: str = arg(group=my_second_group, mutually_exclusive_group=my_2nd_exclusive_group)
...
>>>
>>> make_parser(Args).print_help()
usage: [-h] [--foo FOO | --bar BAR] (--sam SAM | --ham HAM)
options:
-h, --help show this help message and exit
Group1:
First group description
--foo FOO
--bar BAR
Group2:
Second group description
--sam SAM
--ham HAM
Note: the delimiter ( ) in the “usage” above indicates that the group is required, while the delimiter [ ] indicates
that it is optional.
The group and mutually_exclusive_group keyword arguments still accept integers and strings, keeping the
functionality compatible with the parameters of the previous version. When a string is passed to the group keyword
argument, it is used as the group title.
The ClassVar fields defined with the functions group() and mutually_exclusive_group() are not populated at run time:
>>> args = parse(Args, ['--sam', 'wise'])
>>> print(args)
Args(foo=None, bar=None, sam='wise', ham=None)
>>> args.my_first_group
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Args' object has no attribute 'my_first_group'
Parser-level defaults¶
Most of the time, the attributes of the object returned by parse() will be fully determined by inspecting the
command-line arguments. However, the original argparse set_defaults() method allows some additional attributes to be
added without any inspection of the command line. This functionality can be reproduced with the default() function:
>>> from dataparsers import parse, default
>>> @dataclass
... class Args:
...     foo: int
...     bar: int = default(42)
...     baz: str = default("badger")
...
>>> parse(Args, ["736"])
Args(foo=736, bar=42, baz='badger')
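For reference, this is the argparse `set_defaults()` call that the example above reproduces, written as a plain argparse sketch. Note how a single call sets several attributes at once.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("foo", type=int)
# Attributes added without inspecting the command line
parser.set_defaults(bar=42, baz="badger")

namespace = parser.parse_args(["736"])
print(namespace)
```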
Parser-level defaults are the way recommended by the original argparse documentation to work with multiple
subparsers. See the subparser() function in section “Subparsers” for examples.
One obvious difference between the default() function and the original set_defaults() method is that default() must
be used for each argument separately.
Parser specifications¶
To specify detailed options to the created ArgumentParser object, use the dataparser() decorator:
>>> from dataparsers import dataparser, make_parser
>>> @dataparser(prog='MyProgram', description='A foo that bars')
... class Args:
...     ...
...
>>> make_parser(Args).print_help()
usage: MyProgram [-h]
A foo that bars
options:
-h, --help show this help message and exit
In general, the dataparser() decorator accepts all parameters that are used in the original ArgumentParser
constructor, and some additional parameters.
Groups description and required status¶
Note
In v2.1, the introduction of 2 new functions (group() and mutually_exclusive_group()) and 2 new keyword
arguments for the arg() function (group and mutually_exclusive_group) made it easier to specify description
and required status of the groups at the class scope. These may be better than using the dataparser() decorator.
See “Argument groups using ClassVar”.
Two important additional parameters accepted by the dataparser() decorator are the dictionaries groups_descriptions
and required_mutually_exclusive_groups, whose keys should match values of the group_title or
mutually_exclusive_group_id arguments passed to the arg() function (strings or integers):
>>> @dataparser(
...     groups_descriptions={"Group1": "1st group description", "Group2": "2nd group description"},
...     required_mutually_exclusive_groups={0: True, 1: False},
...     add_help=False,  # Disable automatic addition of `-h` or `--help` at the command line
... )
... class Args:
...     foo: str = arg(group_title="Group1", mutually_exclusive_group_id=0)
...     bar: int = arg(group_title="Group1", mutually_exclusive_group_id=0)
...     sam: bool = arg(group_title="Group2", mutually_exclusive_group_id=1)
...     ham: float = arg(group_title="Group2", mutually_exclusive_group_id=1)
...
>>> make_parser(Args).print_help()
usage: (--foo FOO | --bar BAR) [--sam | --ham HAM]
Group1:
1st group description
--foo FOO
--bar BAR
Group2:
2nd group description
--sam
--ham HAM
Default for booleans¶
Boolean attributes with no default field value (or without the action and default keyword arguments passed to the
arg() function) receive the default value, and the matching store action, determined by the additional parameter
default_bool (which defaults to False, i.e., action="store_true"):
>>> @dataparser
... class Args:
...     foo: bool
...
>>> parse(Args, ["--foo"])
Args(foo=True)
>>>
>>> @dataparser(default_bool=True)
... class Args:
...     foo: bool = arg(help="Boolean value")
...
>>> parse(Args, ["--foo"])
Args(foo=False)
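The two settings of default_bool correspond to the two argparse actions mentioned above. The sketch below shows both in plain argparse; the parser names are chosen for the example and the link to default_bool is taken from the description in this section.

```python
import argparse

# default_bool=False counterpart: the flag switches the value to True
parser_false = argparse.ArgumentParser()
parser_false.add_argument("--foo", action="store_true")

# default_bool=True counterpart: the flag switches the value to False
parser_true = argparse.ArgumentParser()
parser_true.add_argument("--foo", action="store_false")

print(parser_false.parse_args([]).foo, parser_false.parse_args(["--foo"]).foo)
print(parser_true.parse_args([]).foo, parser_true.parse_args(["--foo"]).foo)
```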
Help formatter function¶
A last additional parameter accepted by the dataparser() decorator is the help_formatter function, which formats the
help text of the arguments, allowing the help formatting to be customized. This function must accept a single str as
its first positional argument and return the formatted text as a string, i.e., (str) -> str. When this option is used,
the formatter_class parameter passed to the ArgumentParser constructor is assumed to be
RawDescriptionHelpFormatter.
This project provides a predefined function, write_help(), that can be used as the help_formatter option to
preserve line breaks and add blank lines between parameter descriptions:
>>> from dataparsers import arg, make_parser, dataparser, write_help
>>> @dataparser(help_formatter=write_help)
... class Args:
...     foo: float = arg(
...         default=12.5,
...         help='''This description is printed as written here.
...                 It preserves lines breaks.''',
...     )
...     bar: float = arg(
...         default=25.5,
...         help='''This description is also formatted by `write_help` and
...                 it is separated from the previous by a blank line.
...                 The parameter has default value of %(default)s.''',
...     )
...
>>>
>>> make_parser(Args).print_help()
usage: [-h] [--foo FOO] [--bar BAR]
options:
-h, --help show this help message and exit
--foo FOO This description is printed as written here.
It preserves lines breaks.
--bar BAR This description is also formatted by `write_help` and
it is separated from the previous by a blank line.
The parameter has default value of 25.5.
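Going by the (str) -> str signature described above, a user-defined formatter could be as small as the following sketch. The function name `clean_help` is hypothetical and not part of dataparsers; it relies only on the standard library `inspect.cleandoc()` to remove the indentation that multi-line help strings pick up from the source code.

```python
import inspect


def clean_help(text: str) -> str:
    """Hypothetical formatter: remove common indentation and surrounding blank lines."""
    return inspect.cleandoc(text)


raw = """First line of the help text.
         Second line, indented in the source code.
"""
print(clean_help(raw))
```

Under that assumption, it would be passed as `help_formatter=clean_help` to the dataparser() decorator.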
Subparsers (v2.1+)¶
To define subparsers (or sub-commands), use a ClassVar and initialize it with the subparser() function. This function
accepts all parameters of the original add_parser() method, except for name: the subparser is named after the
dataclass field.
Note
Subparsers added to a subparser are not yet supported.
To add an argument to the created subparser (instead of the main parser), use the subparser keyword argument of the
arg() function and assign to it the previously created field:
>>> from typing import ClassVar
>>> from dataparsers import dataparser, arg, subparser, parse
>>>
>>> @dataparser(prog="PROG")
... class Args:
...     foo: bool = arg(help="foo help")
...     ...
...     a: ClassVar = subparser(help="a help")
...     bar: int = arg(help="bar help", subparser=a)
...     ...
...     b: ClassVar = subparser(help="b help")
...     baz: str = arg(make_flag=True, choices="XYZ", help="baz help", subparser=b)
...
>>> parse(Args, ["a", "12"])
Args(foo=False, bar=12, baz=None)
>>> parse(Args, ["--foo", "b", "--baz", "Z"])
Args(foo=True, bar=None, baz='Z')
As in the original module, when a help message is requested from a subparser, only the help for that particular parser
will be printed. The help message will not include parent parser or sibling parser messages. A help message for each
subparser command, however, can be given by supplying the help=... argument to subparser() as above:
>>> parse(Args, ["--help"])
usage: PROG [-h] [--foo] {a,b} ...
positional arguments:
{a,b}
a a help
b b help
options:
-h, --help show this help message and exit
--foo foo help
>>> parse(Args, ["a", "--help"])
usage: PROG a [-h] bar
positional arguments:
bar bar help
options:
-h, --help show this help message and exit
>>> parse(Args, ["b", "--help"])
usage: PROG b [-h] [--baz {X,Y,Z}]
options:
-h, --help show this help message and exit
--baz {X,Y,Z} baz help
The ClassVar defined with the subparser() function remains available as a read-only class variable at run time (it is
an instance of SubParser, a frozen dataclass):
>>> args = parse(Args)
>>> args.a
SubParser(defaults=None, kwargs=mappingproxy({'help': 'a help'}))
>>> args.a.defaults="test"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 4, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field 'defaults'
Subparsers group¶
It is not necessary to create the “subparsers group” when creating subparsers: the group is automatically created.
However, if you want to explicitly pass information to the “subparsers group”, then create a str field and initialize
it with the function subparsers(). This function accepts all parameters of the original add_subparsers() method
(except for dest, which automatically receives the dataclass field name):
>>> @dataparser(prog="PROG")
... class Args:
...     foo: bool = arg(help="foo help")
...     subparsers_group: str = subparsers(help="sub-command help")
...     ...
...     a: ClassVar = subparser(help="a help")
...     bar: int = arg(help="bar help", subparser=a)
...     ...
...     b: ClassVar = subparser(help="b help")
...     baz: str = arg(make_flag=True, choices="XYZ", help="baz help", subparser=b)
Some possible keyword arguments highlighted in the original add_subparsers() method are title=... and
description=.... When either is present, the subparser’s commands will appear in their own group in the help output:
>>> @dataclass
... class Args:
...     subparsers_group: str = subparsers(
...         title="subcommands",
...         description="valid subcommands",
...         help="additional help",
...     )
...     foo: ClassVar = subparser()
...     bar: ClassVar = subparser()
...
>>> parse(Args, ["-h"])
usage: [-h] {foo,bar} ...
options:
-h, --help show this help message and exit
subcommands:
valid subcommands
{foo,bar} additional help
Subparsers defaults¶
One additional keyword argument of the subparser() function (i.e., beyond those of the original add_parser() method)
is the defaults dictionary, which reproduces the functionality of the original set_defaults() method (or of the
“main parser-level” default() function) for the created subparsers.
One caveat of this functionality is that the dictionary keys must first be declared as main parser-level default
fields, using the default() function:
>>> @dataclass
... class Args:
...     foo: str = default()
...     bar: ClassVar = subparser(defaults=dict(foo="spam"))
...     baz: ClassVar = subparser(defaults=dict(foo="badger"))
...
>>> parse(Args, ['bar'])
Args(foo='spam')
>>> parse(Args, ['baz'])
Args(foo='badger')
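In plain argparse, the same result is obtained by calling `set_defaults()` on each subparser. The following sketch shows that mechanism for comparison; it is not a description of how dataparsers builds its subparsers.

```python
import argparse

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()

# Each subparser carries its own value for the attribute `foo`
bar = subparsers.add_parser("bar")
bar.set_defaults(foo="spam")

baz = subparsers.add_parser("baz")
baz.set_defaults(foo="badger")

print(parser.parse_args(["bar"]).foo)
print(parser.parse_args(["baz"]).foo)
```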
Combining parser-level defaults with subparser defaults is the way recommended by the original argparse
documentation to handle multiple subparsers (see below).
Handling sub-commands¶
As in the original argparse module, there are two possible ways to handle sub-commands: (1) using parser-level
defaults and (2) using the subparser name.
1. Using parser-level defaults¶
An effective way of handling sub-commands, taken from the original argparse module, is to combine the subparser()
function with the defaults keyword argument dictionary, so that each subparser knows which Python function it should
execute. For example:
>>> from __future__ import annotations # necessary to annotate sub-command functions
>>> from typing import ClassVar, Callable
>>> from dataclasses import dataclass
>>> from dataparsers import arg, parse, subparser, default
>>>
>>> # sub-command functions
>>> def foo(args: Args):
...     print(args.x * args.y)
...
>>> def bar(args: Args):
...     print("((%s))" % args.z)
...
>>> @dataclass
... class Args:
...     func: Callable = default()
...     ...
...     # parser for the "foo" command
...     foo: ClassVar = subparser(defaults=dict(func=foo))
...     x: int = arg("-x", default=1, make_flag=False, subparser=foo)
...     y: float = arg(subparser=foo)
...     ...
...     # parser for the "bar" command
...     bar: ClassVar = subparser(defaults=dict(func=bar))
...     z: str = arg(subparser=bar)
...
>>> # parse the args and call whatever function was selected
>>> args = parse(Args, "foo 1 -x 2".split())
>>> args.func(args)
2.0
>>>
>>> # parse the args and call whatever function was selected
>>> args = parse(Args, "bar XYZYX".split())
>>> args.func(args)
((XYZYX))
This way, parse() does the job of selecting the appropriate function, which can be called once argument parsing is
complete. According to the argparse documentation, associating functions with actions like this is typically the
easiest way to handle the different actions for each of your subparsers.
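The pattern can be compared with its plain argparse form, sketched below with `add_subparsers()` and `set_defaults(func=...)`. In this sketch the sub-command functions return their result instead of printing it, purely to make the values easy to check.

```python
import argparse


# Sub-command functions (returning values instead of printing them)
def foo(args):
    return args.x * args.y


def bar(args):
    return "((%s))" % args.z


parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()

# Parser for the "foo" command
parser_foo = subparsers.add_parser("foo")
parser_foo.add_argument("-x", type=int, default=1)
parser_foo.add_argument("y", type=float)
parser_foo.set_defaults(func=foo)

# Parser for the "bar" command
parser_bar = subparsers.add_parser("bar")
parser_bar.add_argument("z")
parser_bar.set_defaults(func=bar)

# Parse the arguments and call whichever function was selected
args = parser.parse_args("foo 1 -x 2".split())
result_foo = args.func(args)

args = parser.parse_args("bar XYZYX".split())
result_bar = args.func(args)

print(result_foo, result_bar)
```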
2. Using subparser name¶
If it is necessary to check the name of the subparser that was invoked, use the str “subparsers group” field created
with the subparsers() function:
>>> from typing import ClassVar
>>> from dataclasses import dataclass
>>> from dataparsers import arg, parse, subparser, subparsers
>>>
>>> @dataclass
... class Args:
...     ...
...     subparser_name: str = subparsers()
...     ...
...     s1: ClassVar = subparser()
...     x: str = arg("-x", make_flag=False, subparser=s1)
...     ...
...     s2: ClassVar = subparser()
...     y: str = arg(subparser=s2)
...
>>> parse(Args, ["s2", "frobble"])
Args(subparser_name='s2', x=None, y='frobble')
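In plain argparse, the subparser name is captured through the dest keyword argument of `add_subparsers()`, which the section “Subparsers group” says is filled with the dataclass field name. A sketch of the equivalent argparse parser:

```python
import argparse

parser = argparse.ArgumentParser()
# dest stores the name of the invoked subparser
subparsers = parser.add_subparsers(dest="subparser_name")

s1 = subparsers.add_parser("s1")
s1.add_argument("-x")

s2 = subparsers.add_parser("s2")
s2.add_argument("y")

args = parser.parse_args(["s2", "frobble"])
print(args)
```

One difference is visible here: the argparse `Namespace` has no `x` attribute when `s1` is not invoked, whereas the dataclass result above shows `x=None`.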