API Documentation

insights.core

class insights.core.CommandParser(context, extra_bad_lines=None)[source]

Bases: Parser

This class checks output from the command defined in the spec.

Raises

ContentException -- When context.content contains a single line and that line contains one of the strings in the bad_single_lines or extra_bad_lines list, or when context.content contains multiple lines and any line contains one of the strings in the bad_lines or extra_bad_lines list.

static validate_lines(results, bad_single_lines, bad_lines)[source]

This function returns False when:

1. `results` is a single line and that line contains one of the
   strings in the `bad_single_lines` list.
2. `results` contains multiple lines and any line contains one of
   the strings in the `bad_lines` list.

If no bad line is found, the function returns True.

Parameters
  • results (list) -- The output lines of the command defined by the command spec.

  • bad_single_lines (list) -- The list of bad lines to check for only when the result contains a single line.

  • bad_lines (list) -- The list of bad lines to check for only when the result contains multiple lines.

Returns

True if no bad line is found, False otherwise.

Return type

(Boolean)
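The check described above can be sketched as a standalone function. This is a simplified re-implementation for illustration only, not the insights source:

```python
def validate_lines(results, bad_single_lines, bad_lines):
    """Return False if any configured bad string appears; True otherwise."""
    if len(results) == 1:
        # Single-line output: check only the single-line bad strings.
        return not any(bad in results[0] for bad in bad_single_lines)
    # Multi-line output: check every line against the multi-line bad strings.
    return not any(bad in line for line in results for bad in bad_lines)

print(validate_lines(['bash: foo: command not found'], ['command not found'], []))  # False
print(validate_lines(['line one', 'line two'], [], ['No such file']))  # True
```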

class insights.core.ConfigCombiner(confs, main_file, include_finder)[source]

Bases: ConfigComponent

Base Insights component class for Combiners of configuration files with include directives for supplementary configuration files. httpd and nginx are examples.

find_main(name)[source]
find_matches(confs, pattern)[source]
class insights.core.ConfigComponent[source]

Bases: object

property directives
find(*queries, **kwargs)[source]

Finds matching results anywhere in the configuration

find_all(*queries, **kwargs)

Finds matching results anywhere in the configuration

property sections
select(*queries, **kwargs)[source]

Given a list of queries, executes those queries against the set of Nodes. A Node has three primary attributes: name (str), attrs ([str|int]), and children ([Node]).

Nodes also have a value attribute that is either the first attribute (in the case of simple directives that only have one), or the string representation of all attributes joined by a single space.

Each positional argument to select represents a query against the name and/or attributes of the corresponding level of the configuration tree. The first argument queries root nodes, the second argument queries children of the root nodes, etc.

An individual query is either a single value or a tuple. A single value queries the name of a Node. A tuple queries the name and the attrs.

So: select(name_predicate) or select((name_predicate, attrs_predicate))

In general, select(pred1, pred2, pred3, …)

If a predicate is a simple value (string or int), an exact match is required for names, and an exact match of any attribute is required for attributes.

Examples: select(“Directory”) queries for all root nodes named Directory.

select(“Directory”, “Options”) queries for all root nodes named Directory that contain at least one child node named Options. Notice the argument positions: Directory is in position 1, and Options is in position 2.

select((“Directory”, “/”)) queries for all root nodes named Directory that contain an attribute exactly matching “/”. Notice this is one argument to select: a 2-tuple with predicates for name and attrs.

If you are only interested in attributes, just pass None for the name predicate in the tuple: select((None, “/”)) will return all root nodes with at least one attribute of “/”

In addition to exact matches, the elements of a query can be functions that accept the value corresponding to their position in the query. A handful of useful functions and boolean operators between them are provided.

select(startswith(“Dir”)) queries for all root nodes with names starting with “Dir”.

select(~startswith(“Dir”)) queries for all root nodes with names not starting with “Dir”.

select(startswith(“Dir”) | startswith(“Ali”)) queries for all root nodes with names starting with “Dir” or “Ali”. The return of | is a single callable passed in the first argument position of select.

select(~startswith(“Dir”) & ~startswith(“Ali”)) queries for all root nodes with names not starting with “Dir” or “Ali”.

If a function is in an attribute position, it is considered True if it returns True for any attribute.

For example, select((None, 80)) often will return the list of one Node [Listen 80]

select((“Directory”, startswith(“/var”))) will return all root nodes named Directory that also have an attribute starting with “/var”

If you know that your selection will only return one element, or you only want the first or last result of the query, pass one=first or one=last.

select((“Directory”, startswith(“/”)), one=last) will return the single root node for the last Directory entry starting with “/”

If instead of the root nodes that match you want the child nodes that caused the match, pass roots=False.

node = select((“Directory”, “/var/www/html”), “Options”, one=last, roots=False) might return the Options node if the Directory for “/var/www/html” was defined and contained an Options Directive. You could then access the attributes with node.attrs. If the query didn’t match anything, it would have returned None.

If you want to slide the query down the branches of the config, pass deep=True to select. That allows you to do conf.select(“Directory”, deep=True, roots=False) and get back all Directory nodes regardless of nesting depth.

conf.select() returns everything.

Available predicates are: & (infix boolean and) | (infix boolean or) ~ (prefix boolean not)

For ints or strings: eq (==) e.g. conf.select("Directory", ("StartServers", eq(4))); ge (>=) e.g. conf.select("Directory", ("StartServers", ge(4))); gt (>); le (<=); lt (<)

For strings: contains endswith startswith
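The composable predicates above (startswith with ~, &, and |) can be illustrated with a small toy class. This is an assumption-laden sketch of the semantics, not the insights implementation:

```python
# Toy predicate wrapper supporting ~, & and | composition, as described
# for select(). Names here (Pred, startswith) are illustrative only.
class Pred:
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, value):
        return self.fn(value)

    def __and__(self, other):
        return Pred(lambda v: self(v) and other(v))

    def __or__(self, other):
        return Pred(lambda v: self(v) or other(v))

    def __invert__(self):
        return Pred(lambda v: not self(v))


def startswith(prefix):
    return Pred(lambda v: isinstance(v, str) and v.startswith(prefix))


names = ["Directory", "Alias", "Listen"]
p = ~startswith("Dir") & ~startswith("Ali")
print([n for n in names if p(n)])  # ['Listen']
```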

class insights.core.ConfigParser(context)[source]

Bases: Parser, ConfigComponent

Base Insights component class for Parsers of configuration files.

Raises

SkipException -- When input content is empty.

lineat(pos)[source]
parse_content(content)[source]

This method must be implemented by classes based on this class.

parse_doc(content)[source]
class insights.core.ContainerConfigCombiner(confs, main_file, include_finder, engine, image, container_id)[source]

Bases: ConfigCombiner

Base Insights component class for Combiners of container configuration files with include directives for supplementary configuration files. httpd and nginx are examples.

property conf_path
container_id

The ID of the container.

Type

str

engine

The engine provider of the container.

Type

str

image

The image of the container.

Type

str

class insights.core.ContainerParser(context)[source]

Bases: Parser

A class specifically for container parsers, providing the image name, the engine provider, and the container ID in addition to the attributes of Parser.

container_id

The ID of the container.

Type

str

engine

The engine provider of the container.

Type

str

image

The image of the container.

Type

str

class insights.core.FileListing(context)[source]

Bases: Parser

Reads a series of concatenated directory listings and turns them into a dictionary of entities by name. Stores all the information for each directory entry for every entry that can be parsed, containing:

  • type (one of [bcdlps-])

  • permission string including ACL character

  • number of links

  • owner and group (as given in the listing)

  • size, or major and minor number for block and character devices

  • date (in the format given in the listing)

  • name

  • name of linked file, if a symlink

In addition, the raw line is always stored, even if the line doesn’t look like a directory entry.

Also provides a number of other conveniences, such as:

  • lists of regular and special files and subdirectory names for each directory, in the order found in the listing

  • total blocks allocated to all the entities in this directory

Note

For listings that only contain one directory, ls does not output the directory name. The directory is reverse engineered from the path given to the parser by Insights - this assumes the translation of spaces to underscores and ‘/’ to ‘.’ in paths. For example, ls -l /var/www/html will be translated to ls_-l_.var.www.html. The reverse translation will make mistakes, for example in translating .etc.yum.repos.d to /etc/yum/repos/d. Use caution in checking the paths when requesting single directories.
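The path mangling described in the note, and its ambiguity on reversal, can be demonstrated with a short sketch (the helper name mangle is hypothetical; only the translation rules come from the note above):

```python
# Spaces become underscores and '/' becomes '.' in spec paths.
def mangle(cmd):
    return cmd.replace(' ', '_').replace('/', '.')

print(mangle('ls -l /var/www/html'))  # 'ls_-l_.var.www.html'

# The translation is lossy: these two distinct paths mangle identically,
# which is why reverse translation can make mistakes.
print(mangle('ls -l /etc/yum.repos.d') == mangle('ls -l /etc/yum/repos/d'))  # True
```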

Parses the SELinux information if present in the listing. SELinux directory listings contain:

  • the type of file

  • the permissions block

  • the owner and group as given in the directory listing

  • the SELinux user, role, type and MLS

  • the name, and link destination if it’s a symlink

Sample input data looks like this:

/example_dir:
total 20
dr-xr-xr-x. 3 0 0 4096 Mar 4 16:19 .
-rw-r--r--. 1 0 0 123891 Aug 25 2015 config-3.10.0-229.14.1.el7.x86_64
lrwxrwxrwx. 1 0 0 11 Aug 4 2014 menu.lst -> ./grub.conf
brw-rw----. 1 0 6 253, 10 Aug 4 16:56 dm-10
crw-------. 1 0 0 10, 236 Jul 25 10:00 control

Examples

>>> file_listing
<insights.core.FileListing at 0x7f5319407450>
>>> '/example_dir' in file_listing
True
>>> file_listing.dir_contains('/example_dir', 'menu.lst')
True
>>> dir = file_listing.listing_of('/example_dir')
>>> dir['.']['type']
'd'
>>> dir['config-3.10.0-229.14.1.el7.x86_64']['size']
123891
>>> dir['dm-10']['major']
253
>>> dir['menu.lst']['link']
'./grub.conf'
dir_contains(directory, name)[source]

Does this directory contain this entry name?

dir_entry(directory, name)[source]

The parsed data for the given entry name in the given directory.

dirs_of(directory)[source]

The list of subdirectories in the given directory.

files_of(directory)[source]

The list of non-special files (i.e. not block or character files) in the given directory.

listing_of(directory)[source]

The listing of this directory, in a dictionary by entry name. All entries contain the original line as is in the ‘raw_entry’ key. Entries that can be parsed then have fields as described in the class description above.

parse_content(content)[source]

Called automatically to process the directory listing(s) contained in the content.

path_entry(path)[source]

The parsed data given a path, which is separated into its directory and entry name.

specials_of(directory)[source]

The list of block and character special files in the given directory.

total_of(directory)[source]

The total blocks of storage consumed by entries in this directory.

class insights.core.IniConfigFile(context)[source]

Bases: ConfigParser

A class specifically for reading configuration files in ‘ini’ format.

The input file format supported by this class is:

[section 1]
key = value
; comment
# comment
[section 2]
key with spaces = value string
[section 3]
# Must implement parse_content in child class
# and pass allow_no_value=True to parent class
# to enable keys with no values
key_with_no_value

Examples

>>> class MyConfig(IniConfigFile):
...     pass
>>> content = '''
... [defaults]
... admin_token = ADMIN
... [program opts]
... memsize = 1024
... delay = 1.5
... [logging]
... log = true
... logging level = verbose
... '''.strip()
>>> my_config = MyConfig(context_wrap(content, path='/etc/myconfig.conf'))
>>> 'program opts' in my_config
True
>>> my_config.sections()
['program opts', 'logging']
>>> my_config.defaults()
{'admin_token': 'ADMIN'}
>>> my_config.items('program opts')
{'memsize': '1024', 'delay': '1.5'}
>>> my_config.get('logging', 'logging level')
'verbose'
>>> my_config.getint('program opts', 'memsize')
1024
>>> my_config.getfloat('program opts', 'delay')
1.5
>>> my_config.getboolean('logging', 'log')
True
>>> my_config.has_option('logging', 'log')
True
property data

Returns self (obj); retained for backward compatibility.

defaults()[source]
Returns

Returns any options under the DEFAULT section.

Return type

dict

get(section, option)[source]
Parameters
  • section (str) -- The section str to search for.

  • option (str) -- The option str to search for.

Returns

Returns the value of the option in the specified section.

Return type

str

getboolean(section, option)[source]
Returns

Returns boolean form based on the data from get.

Return type

bool

getfloat(section, option)[source]
Returns

Returns the float value of the data from get.

Return type

float

getint(section, option)[source]
Returns

Returns the int value of the data from get.

Return type

int

has_option(section, option)[source]
Parameters
  • section (str) -- The section str to search for.

  • option (str) -- The option str to search for.

Returns

Returns whether the option exists in the given section.

Return type

bool

items(section)[source]
Parameters

section (str) -- The section str to search for.

Returns

Returns all of the options in the specified section.

Return type

dict

parse_content(content, allow_no_value=False)[source]

This method must be implemented by classes based on this class.

parse_doc(content)[source]
sections()[source]
Returns

Returns all of the parsed sections excluding DEFAULT.

Return type

list

set(section, option, value=None)[source]

Sets the value of the specified section option.

Parameters
  • section (str) -- The section str to set for.

  • option (str) -- The option str to set for.

  • value (str) -- The value to set.

class insights.core.JSONParser(context)[source]

Bases: Parser, LegacyItemAccess

A parser class that reads JSON files. Base your own parser on this.

parse_content(content)[source]

This method must be implemented by classes based on this class.

class insights.core.LegacyItemAccess[source]

Bases: object

Mixin class to provide legacy access to self.data attribute.

Provides expected passthru functionality for classes that still use self.data as the primary data structure for all parsed information. Use this as a mixin on parsers that expect these methods to be present as they were previously.

Examples

>>> class MyParser(LegacyItemAccess, Parser):
...     def parse_content(self, content):
...         self.data = {}
...         for line in content:
...             if 'fact' in line:
...                 k, v = line.split('=')
...                 self.data[k.strip()] = v.strip()
>>> content = '''
... # Comment line
... fact1=fact 1
... fact2=fact 2
... fact3=fact 3
... '''.strip()
>>> my_parser = MyParser(context_wrap(content, path='/etc/path_to_content/content.conf'))
>>> my_parser.data
{'fact1': 'fact 1', 'fact2': 'fact 2', 'fact3': 'fact 3'}
>>> my_parser.file_path
'/etc/path_to_content/content.conf'
>>> my_parser.file_name
'content.conf'
>>> my_parser['fact1']
'fact 1'
>>> 'fact2' in my_parser
True
>>> my_parser.get('fact3', default='no fact')
'fact 3'
get(item, default=None)[source]

Returns value of key item in self.data or default if key is not present.

Parameters
  • item -- Key to get from self.data.

  • default -- Default value to return if key is not present.

Returns

String value of the stored item, or the default if not found.

Return type

(str)

class insights.core.LogFileOutput(context)[source]

Bases: Parser

Class for parsing log file content.

Log file content is stored in raw format in the lines attribute.

Assume the log file content is:

Log file line one
Log file line two
Log file line three, and more

Examples

>>> class MyLogger(LogFileOutput):
...     pass
>>> MyLogger.keep_scan('get_one', 'one')
>>> MyLogger.keep_scan('get_three_and_more', ['three', 'more'])
>>> MyLogger.keep_scan('get_one_or_two', ['one', 'two'], check=any)
>>> MyLogger.last_scan('last_line_contains_file', 'file')
>>> MyLogger.keep_scan('last_2_lines_contain_file', 'file', num=2, reverse=True)
>>> MyLogger.keep_scan('last_3_lines_contain_line_and_t', ['line', 't'], num=3, reverse=True)
>>> MyLogger.token_scan('find_more', 'more')
>>> MyLogger.token_scan('find_four_and_more', ['four', 'more'])
>>> MyLogger.token_scan('find_four_or_more', ['four', 'more'], check=any)
>>> my_logger = MyLogger(context_wrap(contents, path='/var/log/mylog'))
>>> my_logger.file_path
'/var/log/mylog'
>>> my_logger.file_name
'mylog'
>>> my_logger.get('two')
[{'raw_message': 'Log file line two'}]
>>> 'line three,' in my_logger
True
>>> my_logger.get(['three', 'more'])
[{'raw_message': 'Log file line three, and more'}]
>>> my_logger.lines[0]
'Log file line one'
>>> my_logger.get_one
[{'raw_message': 'Log file line one'}]
>>> my_logger.get_three_and_more == my_logger.get(['three', 'more'])
True
>>> my_logger.last_line_contains_file
{'raw_message': 'Log file line three, and more'}
>>> len(my_logger.last_2_lines_contain_file)
2
>>> len(my_logger.last_3_lines_contain_line_and_t)  # Only 2 lines contain 'line' and 't'
2
>>> my_logger.find_more
True
>>> my_logger.find_four_and_more
False
>>> my_logger.find_four_or_more
True
lines

List of the lines from the log file content.

Type

list

get(s, check=<built-in function all>, num=None, reverse=False)[source]

Returns all lines that contain s anywhere, wrapped in a list of dictionaries. s can be either a single string or a list of strings. For a list, all keywords in the list must be found in each line.

Parameters
  • s (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

  • num (int) -- the number of lines to get, None for unlimited

  • reverse (bool) -- scan from the head of the file when False (the default), otherwise scan from the tail

Returns

list of dictionaries corresponding to the parsed lines that contain s.

Return type

(list)

Raises

TypeError -- When s is not a string or a list of strings, or num is not an integer.

get_after(timestamp, s=None)[source]

Find all the (available) logs that are after the given time stamp.

If s is not supplied, all lines are used. Otherwise, only the lines that contain s are used. s can be either a single string or a list of strings. For a list, all keywords in the list must be found in each line.

This method then finds all lines which have a time stamp after the given timestamp. Lines that do not contain a time stamp are considered to be part of the previous line and are therefore included if the last log line was included or excluded otherwise.

Time stamps are recognised by converting the time format into a regular expression which matches the time format in the string. This is then searched for in each line in turn. Only lines with a time stamp matching this expression will trigger the decision to include or exclude lines. Therefore, if the log for some reason does not contain a time stamp that matches this format, no lines will be returned.

The time format is given in strptime() format, in the object’s time_format property. Users of the object should not change this property; instead, the parser should subclass LogFileOutput and change the time_format property.

Some logs, regrettably, change time stamp formats across different lines, or across different versions of the program. In order to accommodate this, the timestamp format can be a list of strptime() format strings. These are combined as alternatives in the regular expression, and are given to strptime() in order. They can also be listed as the values of a dict, e.g.:

{'pre_10.1.5': '%y%m%d %H:%M:%S', 'post_10.1.5': '%Y-%m-%d %H:%M:%S'}

Note

Some logs - notably /var/log/messages - do not contain a year in the timestamp. This is detected by the absence of a ‘%y’ or ‘%Y’ in the time format. If the year field is absent, the year is assumed to be the year in the given timestamp being sought. Some attempt is made to handle logs with a rollover from December to January, by finding when the log’s timestamp (with current year assumed) is over eleven months (specifically, 330 days) ahead of or behind the timestamp date, and shifting that log date by 365 days so that it is more likely to be in the sought range. This paragraph is sponsored by syslog.

Parameters
  • timestamp (datetime.datetime) -- lines before this time are ignored.

  • s (str or list) -- one or more strings to search for. If not supplied, all available lines are searched.

Yields

dict -- The parsed lines with timestamps after this date, in the same format they were supplied. Each contains at least the raw_message key.

Raises

ParseException -- If the format conversion string contains a format that we don’t recognise. In particular, no attempt is made to recognise or parse the time zone or other obscure values like day of year or week of year.
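The timestamp-recognition mechanism described above (converting a strptime() format into a regular expression) can be sketched in a few lines. The token table and helper name fmt_to_regex are assumptions for illustration; insights' own conversion supports more format codes:

```python
import re

# Map a handful of strptime() tokens to regex fragments (illustrative subset).
FMT_RE = {
    '%Y': r'\d{4}', '%y': r'\d{2}', '%m': r'\d{2}', '%d': r'\d{1,2}',
    '%H': r'\d{2}', '%M': r'\d{2}', '%S': r'\d{2}', '%b': r'[A-Z][a-z]{2}',
}


def fmt_to_regex(fmt):
    # Escape literal characters, then substitute each format token.
    out = re.escape(fmt)
    for token, pattern in FMT_RE.items():
        out = out.replace(re.escape(token), pattern)
    return re.compile(out)


ts = fmt_to_regex('%Y-%m-%d %H:%M:%S')
print(bool(ts.search('2023-05-09 15:13:36 server started')))  # True
print(ts.search('no timestamp on this line'))  # None
```

Lines where this expression finds no match carry no timestamp, which is what drives the "part of the previous line" behaviour described above.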

classmethod keep_scan(result_key, token, check=<built-in function all>, num=None, reverse=False)[source]

Define a property that is set to the list of dictionaries of the lines that contain the given token. Uses the get method of the log file.

Parameters
  • result_key (str) -- the scanner key to register

  • token (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

  • num (int) -- the number of lines to get, None for unlimited

  • reverse (bool) -- scan start from the head when False by default, otherwise start from the tail

Returns

list of dictionaries corresponding to the parsed lines that contain the token.

Return type

(list)

classmethod last_scan(result_key, token, check=<built-in function all>)[source]

Define a property that is set to the dictionary of the last line that contains the given token. Uses the get method of the log file.

Parameters
  • result_key (str) -- the scanner key to register

  • token (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

Returns

dictionary corresponding to the last parsed line that contains the token.

Return type

(dict)

parse_content(content)[source]

Use all the defined scanners to search the log file, setting the properties defined in the scanner.

classmethod scan(result_key, func)[source]

Define computed fields based on a string to “grep for”. This is preferred to utilizing raw log lines in plugins because computed fields will be serialized, whereas raw log lines will not.

Raises

ValueError -- When result_key is already a registered scanner key.

scanner_keys = {}
scanners = []
time_format = '%Y-%m-%d %H:%M:%S'

The timestamp format assumed for the log files. A subclass can override this for files that have a different timestamp format. This can be:

  • A string in strptime() format.

  • A list of strptime() strings.

  • A dictionary with each item’s value being a strptime() string. This allows the item keys to provide some form of documentation.

classmethod token_scan(result_key, token, check=<built-in function all>)[source]

Define a property that is set to true if the given token is found in the log file. Uses the __contains__ method of the log file.

Parameters
  • result_key (str) -- the scanner key to register

  • token (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

Returns

the property will contain True if a line contained (any or all) of the tokens given.

Return type

(bool)

class insights.core.Parser(context)[source]

Bases: object

Base class designed to be subclassed by parsers.

The framework will construct your object with a Context that will provide at least the content as an iterable of lines and the path that the content was retrieved from.

Facts should be exposed as instance members where applicable. For example:

self.fact = "123"

Examples

>>> class MyParser(Parser):
...     def parse_content(self, content):
...         self.facts = []
...         for line in content:
...             if 'fact' in line:
...                 self.facts.append(line)
>>> content = '''
... # Comment line
... fact=fact 1
... fact=fact 2
... fact=fact 3
... '''.strip()
>>> my_parser = MyParser(context_wrap(content, path='/etc/path_to_content/content.conf'))
>>> my_parser.facts
['fact=fact 1', 'fact=fact 2', 'fact=fact 3']
>>> my_parser.file_path
'/etc/path_to_content/content.conf'
>>> my_parser.file_name
'content.conf'
file_name

Filename portion of the input file.

Type

str

file_path

Full context path of the input file.

Type

str

parse_content(content)[source]

This method must be implemented by classes based on this class.

class insights.core.ScanMeta(name, parents, dct)[source]

Bases: type

class insights.core.Scannable(*args, **kwargs)[source]

Bases: Parser

A class to enable early and easy collection of data in a file.

The Scannable class makes it easy to collect two common types of information from a data file:

  • A flag to indicate that the data contains one or more lines with a given string.

  • a list of lines containing a given string.

To create a parser from the Scannable parser class, the main job is to override the parse() method, returning your choice of data structure to represent the information in the file. This takes the form of a generator that yields structures for users of your parser. You can yield more than one object per line, or you can condense multiple lines into one object. Each object is then scanned with all the defined scanners for this class.

How does that work? Well, the individual rules using your parser will use the any() and collect() methods on the class object itself to set up new attributes of the class that will be given values based on the results of a function that checks each object from your parser for the properties it’s looking for. That’s pretty vague, so let’s give some examples - imagine a parser defined as:

class AnacondaLog(Scannable):
    pass

(Which uses the default parse() function that simply yields each line in turn.) A rule using this parser then does:

def warnings(line):
    if 'WARNING' in line:
        return line

def has_fcoe_edd(line):
    return '/usr/libexec/fcoe/fcoe_edd.sh' in line

AnacondaLog.any('has_fcoe', has_fcoe_edd)
AnacondaLog.collect('warnings', warnings)

These then act in the following way:

  • When an object is instantiated from the AnacondaLog class, it will have the ‘has_fcoe’ attribute. This will be set to True if ‘/usr/libexec/fcoe/fcoe_edd.sh’ was found in any line in the file, or False otherwise.

  • When an object is instantiated from the AnacondaLog class, it will have the ‘warnings’ attribute. This will be a list containing all the lines found.

Users of your class can supply any function to either any() or collect(). Functions given to collect() can return anything they want to be collected - if they return something that evaluates to False then nothing is collected (so avoid returning empty lists, empty dicts, empty strings or False).
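The any()/collect() semantics described above can be mimicked with a self-contained toy (MiniScannable is hypothetical and much simpler than the real Scannable, which works through parse() and a metaclass):

```python
# Toy model of the scanner registry: any() stores a flag scanner,
# collect() stores a gathering scanner; both run at construction time.
class MiniScannable:
    scanners = []

    @classmethod
    def any(cls, result_key, func):
        cls.scanners.append((result_key, func, 'any'))

    @classmethod
    def collect(cls, result_key, func):
        cls.scanners.append((result_key, func, 'collect'))

    def __init__(self, lines):
        for key, func, kind in self.scanners:
            # Keep only truthy results, mirroring the "avoid returning
            # empty lists/dicts/strings or False" caveat above.
            hits = [r for r in (func(line) for line in lines) if r]
            setattr(self, key, bool(hits) if kind == 'any' else hits)


MiniScannable.any('has_fcoe', lambda l: 'fcoe_edd.sh' in l)
MiniScannable.collect('warnings', lambda l: l if 'WARNING' in l else None)

log = MiniScannable(['WARNING: disk full', 'ok', '/usr/libexec/fcoe/fcoe_edd.sh ran'])
print(log.has_fcoe)   # True
print(log.warnings)   # ['WARNING: disk full']
```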

classmethod any(result_key, func)[source]

Sets the result_key to the output of func if func ever returns truthy

classmethod collect(result_key, func)[source]

Sets the result_key to an iterable of objects for which func(obj) returns True

parse(content)[source]

Default ‘parsing’ method. Subclasses should override this method with their own custom parsing as necessary.

parse_content(content)[source]

This method must be implemented by classes based on this class.

scanner_keys = {}
scanners = []
class insights.core.StreamParser(context)[source]

Bases: Parser

Parsers that don’t have to store lines or look back in the data stream should implement StreamParser instead of Parser as it is more memory efficient. The only difference between StreamParser and Parser is that StreamParser.parse_content will receive a generator instead of a list.

class insights.core.SysconfigOptions(context)[source]

Bases: Parser, LegacyItemAccess

A parser to handle the standard ‘keyword=value’ format of files in the /etc/sysconfig directory. These are provided in the standard ‘data’ dictionary.

Examples

>>> 'OPTIONS' in ntpconf
True
>>> 'NOT_SET' in ntpconf
False
>>> 'COMMENTED_OUT' in ntpconf
False
>>> ntpconf['OPTIONS']
'-x -g'

For common variables such as OPTIONS, it is recommended to set a specific property in the subclass that fetches this option with a fallback to a default value.

Example subclass:

class DirsrvSysconfig(SysconfigOptions):

    @property
    def options(self):
        return self.data.get('OPTIONS', '')
keys()[source]

Return the list of keys (in no order) in the underlying dictionary.

parse_content(content)[source]

This method must be implemented by classes based on this class.

class insights.core.Syslog(context)[source]

Bases: LogFileOutput

Class for parsing syslog file content.

The important function is get(s), which finds all lines containing the string s and parses them into dictionaries with the following keys:

  • timestamp - the time the log line was written

  • procname - the process or facility that wrote the line

  • hostname - the host that generated the log line

  • message - the rest of the message (after the process name)

  • raw_message - the raw message before being split.

It is best to use filters and/or scanners with the messages log, to speed up parsing. These work on the raw message, before being parsed.

Sample log lines:

May  9 15:13:34 lxc-rhel68-sat56 jabberd/sm[11057]: session started: jid=rhn-dispatcher-sat@lxc-rhel6-sat56.redhat.com/superclient
May  9 15:13:36 lxc-rhel68-sat56 wrapper[11375]: --> Wrapper Started as Daemon
May  9 15:13:36 lxc-rhel68-sat56 wrapper[11375]: Launching a JVM...
May 10 15:24:28 lxc-rhel68-sat56 yum[11597]: Installed: lynx-2.8.6-27.el6.x86_64
May 10 15:36:19 lxc-rhel68-sat56 yum[11954]: Updated: sos-3.2-40.el6.noarch

Examples

>>> Syslog.token_scan('daemon_start', 'Wrapper Started as Daemon')
>>> Syslog.token_scan('yum_updated', ['yum', 'Updated'])
>>> Syslog.keep_scan('yum_lines', 'yum')
>>> Syslog.keep_scan('yum_installed_lines', ['yum', 'Installed'])
>>> syslog.get('wrapper')[0]
{'timestamp': 'May  9 15:13:36', 'hostname': 'lxc-rhel68-sat56',
 'procname': 'wrapper[11375]', 'message': '--> Wrapper Started as Daemon',
 'raw_message': 'May  9 15:13:36 lxc-rhel68-sat56 wrapper[11375]: --> Wrapper Started as Daemon'
}
>>> syslog.daemon_start
True
>>> syslog.yum_updated
True
>>> len(syslog.yum_lines)
2
>>> len(syslog.yum_installed_lines)
1

Note

Because syslog timestamps by default have no year, the year of the logs will be inferred from the year in your timestamp. This will also work around December/January crossovers.

get_logs_by_procname(proc)[source]
Parameters

proc (str) -- The process or facility that you’re looking for

Yields

(dict) -- The parsed syslog messages produced by that process or facility

scanner_keys = {}
scanners = []
time_format = '%b %d %H:%M:%S'

The timestamp format assumed for the log files. A subclass can override this for files that have a different timestamp format. This can be:

  • A string in strptime() format.

  • A list of strptime() strings.

  • A dictionary with each item’s value being a strptime() string. This allows the item keys to provide some form of documentation.

class insights.core.XMLParser(context)[source]

Bases: LegacyItemAccess, Parser

A parser class that reads XML files. Base your own parser on this.

Examples

>>> content = '''
... <?xml version="1.0"?>
... <data xmlns:fictional="http://characters.example.com"
...       xmlns="http://people.example.com">
...     <country name="Liechtenstein">
...         <rank updated="yes">2</rank>
...         <year>2008</year>
...         <gdppc>141100</gdppc>
...         <neighbor name="Austria" direction="E"/>
...         <neighbor name="Switzerland" direction="W"/>
...     </country>
...     <country name="Singapore">
...         <rank updated="yes">5</rank>
...         <year>2011</year>
...         <gdppc>59900</gdppc>
...         <neighbor name="Malaysia" direction="N"/>
...     </country>
...     <country name="Panama">
...         <rank>68</rank>
...         <year>2011</year>
...         <gdppc>13600</gdppc>
...         <neighbor name="Costa Rica" direction="W"/>
...     </country>
... </data>
... '''.strip()
>>> xml_parser = XMLParser(context_wrap(content))
>>> xml_parser.xmlns
'http://people.example.com'
>>> xml_parser.get_elements(".")[0].tag # Top-level elements
'data'
>>> len(xml_parser.get_elements("./country/neighbor", None)) # All 'neighbor' grand-children of 'country' children of the top-level elements
3
>>> len(xml_parser.get_elements(".//year/..[@name='Singapore']")[0]) # Nodes with name='Singapore' that have a 'year' child
1
>>> xml_parser.get_elements(".//*[@name='Singapore']/year")[0].text # 'year' nodes that are children of nodes with name='Singapore'
'2011'
>>> xml_parser.get_elements(".//neighbor[2]", "http://people.example.com")[0].get('name') # All 'neighbor' nodes that are the second child of their parent
'Switzerland'
raw

raw XML content

Type

str

dom

Root element of parsed XML file

Type

Element

xmlns

The default XML namespace, an empty string when no namespace is declared.

Type

str

data

Any properties the specific parser requires can be stored in data.

Type

dict

get_elements(element, xmlns=None)[source]

Return a list of elements that match the search condition. If the XML input has namespaces, tags and attributes in the form prefix:sometag are expanded to {uri}sometag, where the prefix is replaced by the full URI. Likewise, if there is a default namespace, its full URI is prepended to all non-prefixed tags. Element names can contain letters, digits, hyphens, underscores, and periods, but must start with a letter or underscore. Internally, a search condition such as /element1/element2 is rewritten to /{namespace}element1/{namespace}element2.

Parameters
  • element -- Search condition used to select elements in the XML file. For details on how to write search conditions, refer to sections 19.7.2.1. Example and 19.7.2.2. Supported XPath syntax in https://docs.python.org/2/library/xml.etree.elementtree.html

  • xmlns -- XML namespace; defaults to None. When None, self.xmlns (the default namespace) is used. Only a string value (including “”) is treated as an explicit XML namespace.

Returns

List of elements that match the search condition

Return type

(list)

parse_content(content)[source]

All child classes inherit this function to parse the XML file automatically. By default it calls parse_dom() to extract all necessary data into data; the xmlns (the default namespace) is already set when parse_dom() is called.

parse_dom()[source]

If self.data is required, child classes must override this function to set it.

class insights.core.YAMLParser(context)[source]

Bases: Parser, LegacyItemAccess

A parser class that reads YAML files. Base your own parser on this.

parse_content(content)[source]

This method must be implemented by classes based on this class.

insights.core.default_parser_deserializer(_type, data)[source]
insights.core.default_parser_serializer(obj)[source]
insights.core.flatten(docs, pred)[source]

Replace include nodes with their config trees. Allows the same files to be included more than once so long as they don’t induce a recursion.

insights.core.context

class insights.core.context.ClusterArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

class insights.core.context.Context(**kwargs)[source]

Bases: object

product()[source]
stream()[source]
class insights.core.context.Docker(role=None)[source]

Bases: MultiNodeProduct

name = 'docker'
parent_type = 'host'
class insights.core.context.DockerImageContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

class insights.core.context.ExecutionContext(root='/', timeout=None, all_files=None)[source]

Bases: object

check_output(cmd, timeout=None, keep_rc=False, env=None, signum=None)[source]

Subclasses can override to provide special environment setup, command prefixes, etc.

connect(*args, **kwargs)[source]
classmethod handles(files)[source]
locate_path(path)[source]
marker = None
shell_out(cmd, split=True, timeout=None, keep_rc=False, env=None, signum=None)[source]
stream(*args, **kwargs)[source]
class insights.core.context.ExecutionContextMeta(name, bases, dct)[source]

Bases: type

classmethod identify(files)[source]
registry = [<class 'insights.core.context.HostContext'>, <class 'insights.core.context.HostArchiveContext'>, <class 'insights.core.context.SerializedArchiveContext'>, <class 'insights.core.context.SosArchiveContext'>, <class 'insights.core.context.ClusterArchiveContext'>, <class 'insights.core.context.DockerImageContext'>, <class 'insights.core.context.JBossContext'>, <class 'insights.core.context.JDRContext'>, <class 'insights.core.context.InsightsOperatorContext'>, <class 'insights.core.context.MustGatherContext'>, <class 'insights.core.context.OpenStackContext'>]
class insights.core.context.HostArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

marker = 'insights_commands'
class insights.core.context.HostContext(root='/', timeout=30, all_files=None)[source]

Bases: ExecutionContext

class insights.core.context.InsightsOperatorContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

Recognizes insights-operator archives

marker = 'config/featuregate'
class insights.core.context.JBossContext(root='/', timeout=30, all_files=None)[source]

Bases: HostContext

class insights.core.context.JDRContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

locate_path(path)[source]
marker = 'JBOSS_HOME'
class insights.core.context.MultiNodeProduct(role=None)[source]

Bases: object

is_parent()[source]
class insights.core.context.MustGatherContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

Recognizes must-gather archives

marker = 'cluster-scoped-resources'
class insights.core.context.OSP(role=None)[source]

Bases: MultiNodeProduct

name = 'osp'
parent_type = 'Director'
class insights.core.context.OpenStackContext(hostname)[source]

Bases: ExecutionContext

class insights.core.context.RHEL(version=['-1', '-1'], release=None)[source]

Bases: object

classmethod from_metadata(metadata, processor_obj)[source]
name = 'rhel'
class insights.core.context.RHEV(role=None)[source]

Bases: MultiNodeProduct

name = 'rhev'
parent_type = 'Manager'
class insights.core.context.SerializedArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

marker = 'insights_archive.txt'
class insights.core.context.SosArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

marker = 'sos_commands'
insights.core.context.create_product(metadata, hostname)[source]
insights.core.context.fs_root(thing)[source]
insights.core.context.get_system(metadata, hostname)[source]
insights.core.context.product(klass)[source]

insights.core.dr

This module implements an inversion of control framework. It allows dependencies among functions and classes to be declared with decorators and the resulting dependency graphs to be executed.

A decorator used to declare dependencies is called a ComponentType, a decorated function or class is called a component, and a collection of interdependent components is called a graph.

In the example below, needs is a ComponentType, one, two, and add are components, and the relationship formed by their dependencies is a graph.

from insights import dr

class needs(dr.ComponentType):
    pass

@needs()
def one():
    return 1

@needs()
def two():
    return 2

@needs(one, two)
def add(a, b):
    return a + b

results = dr.run(add)

Once all components have been imported, the graphs they form can be run. To execute a graph, dr sorts its components into an order that guarantees dependencies are tried before dependents. Components that raise exceptions are considered invalid, and their dependents will not be executed. If a component is skipped because of a missing dependency, its dependents also will not be executed.

During evaluation, results are accumulated into an object called a Broker, which is just a fancy dictionary. Brokers can be inspected after a run for results, exceptions, tracebacks, and execution times. You also can register callbacks with a broker that get invoked after the attempted execution of every component, so you can inspect it during an evaluation instead of at the end.

class insights.core.dr.Broker(seed_broker=None)[source]

Bases: object

The Broker is a fancy dictionary that keeps up with component instances as a graph is evaluated. It’s the state of the evaluation. Once a graph has executed, the broker will contain everything about the evaluation: component instances, timings, exceptions, and tracebacks.

You can either inspect the broker at the end of an evaluation, or you can register callbacks with it, and they’ll get invoked after each component is called.

instances

the component instances with components as keys.

Type

dict

missing_requirements

components that didn’t have their dependencies met. Values are a two-tuple. The first element is the list of required dependencies that were missing. The second element is the list of “at least one” dependencies that were missing. For more information on dependency types, see the ComponentType docs.

Type

dict

exceptions

Components that raise any type of exception except SkipComponent during evaluation. The key is the component, and the value is a list of exceptions. It’s a list because some components produce multiple instances.

Type

defaultdict(list)

tracebacks

keys are exceptions and values are their text tracebacks.

Type

dict

exec_times

component -> float dictionary where values are the number of seconds the component took to execute. Calculated using time.time(). For components that produce multiple instances, the execution time here is the sum of their individual execution times.

Type

dict

add_exception(component, ex, tb=None)[source]
add_observer(o, component_type=<class 'insights.core.dr.ComponentType'>)[source]

Add a callback that will get invoked after each component is called.

Parameters

o (func) -- the callback function

Keyword Arguments

component_type (ComponentType) -- the ComponentType to observe. The callback will fire any time an instance of the class or its subclasses is invoked.

The callback should look like this:

def callback(comp, broker):
    value = broker.get(comp)
    # do something with value
    pass
fire_observers(component)[source]
get(component, default=None)[source]
get_by_type(_type)[source]

Return all of the instances of ComponentType _type.

items()[source]
keys()[source]
observer(component_type=<class 'insights.core.dr.ComponentType'>)[source]

You can use @broker.observer() as a decorator to your callback instead of Broker.add_observer().

print_component(component_type)[source]
values()[source]
insights.core.dr.add_dependency(component, dep)[source]
insights.core.dr.add_dependent(component, dep)[source]
insights.core.dr.add_ignore(c, i)[source]
insights.core.dr.add_observer(o, component_type=<class 'insights.core.dr.ComponentType'>)[source]

Add a callback that will get invoked after each component is called.

Parameters

o (func) -- the callback function

Keyword Arguments

component_type (ComponentType) -- the ComponentType to observe. The callback will fire any time an instance of the class or its subclasses is invoked.

The callback should look like this:

def callback(comp, broker):
    value = broker.get(comp)
    # do something with value
    pass
insights.core.dr.determine_components(components)[source]
insights.core.dr.first_of(dependencies, broker)[source]
insights.core.dr.generate_incremental(components=None, broker=None)[source]
insights.core.dr.get_base_module_name(obj)[source]
insights.core.dr.get_component(name)[source]

Returns the class or function specified, importing it if necessary.

insights.core.dr.get_component_by_name(name)[source]

Look up a component by its fully qualified name. Return None if the component hasn’t been loaded.

insights.core.dr.get_component_type(component)[source]
insights.core.dr.get_components_of_type(_type)[source]
insights.core.dr.get_delegate(component)[source]
insights.core.dr.get_dependencies(component)[source]
insights.core.dr.get_dependency_graph(component)[source]

Generate a component’s graph of dependencies, which can be passed to run() or run_incremental().

insights.core.dr.get_dependency_specs(component)[source]

Get the dependency specs of the specified component. Only requires and at_least_one specs are returned; optional specs are not considered by this function.

Parameters

component (callable) -- The component to check. The component must already be loaded.

Returns

The requires and at_least_one spec sets of the component.

Return type

list

The return list is in the following format:

 [
     requires_1,
     requires_2,
     (at_least_one_11, at_least_one_12),
     (at_least_one_21, [req_alo22, (alo_23, alo_24)]),
 ]

Note:
 - 'requires_1' and 'requires_2' are `requires` specs.
   Each of them is required.
 - 'at_least_one_11' and 'at_least_one_12' are `at_least_one`
   specs in the same "at least one" set; at least one of
   them is required.
 - 'alo_23' and 'alo_24' are `at_least_one` specs that,
   together with 'req_alo22', are `requires` specs for the
   sub-set. This sub-set and 'at_least_one_21' are
   `at_least_one` specs in the same "at least one" set.
insights.core.dr.get_dependents(component)[source]
insights.core.dr.get_group(component)[source]

Return the dictionary of links associated with the component. Defaults to dict().

insights.core.dr.get_metadata(component)[source]

Return any metadata dictionary associated with the component. Defaults to an empty dictionary.

insights.core.dr.get_missing_requirements(func, requires, d)[source]

Deprecated since version 1.x.

insights.core.dr.get_module_name(obj)[source]
insights.core.dr.get_name(component)[source]

Attempt to get the string name of component, including module and class if applicable.

insights.core.dr.get_registry_points(component)[source]

Loop through the dependency graph to identify the corresponding spec registry points for the component. This is primarily used by datasources and returns a set. In most cases the set contains only one registry point, but in some cases it contains more than one.

Parameters

component (callable) -- The component object

Returns

The set of registry points found.

Return type

(set)

insights.core.dr.get_simple_name(component)[source]
insights.core.dr.get_subgraphs(graph=None)[source]

Given a graph of possibly disconnected components, generate all graphs of connected components. graph is a dictionary of dependencies. Keys are components, and values are sets of components on which they depend.

insights.core.dr.get_tags(component)[source]

Return the set of tags associated with the component. Defaults to set().

insights.core.dr.hashable(v)[source]
insights.core.dr.is_enabled(component)[source]

Check to see if a component is enabled.

Parameters

component (callable) -- The component to check. The component must already be loaded.

Returns

True if the component is enabled. False otherwise.

insights.core.dr.is_hidden(component)[source]
insights.core.dr.is_registry_point(component)[source]
insights.core.dr.load_components(*paths, **kwargs)[source]

Loads all components on the paths. Each path should be a package or module. All components beneath a path are loaded.

Parameters

paths (str) -- A package or module to load

Keyword Arguments
  • include (str) -- A regular expression of packages and modules to include. Defaults to ‘.*’

  • exclude (str) -- A regular expression of packages and modules to exclude. Defaults to ‘test’

  • continue_on_error (bool) -- If True, continue importing even if something raises an ImportError. If False, raise the first ImportError.

Returns

The total number of modules loaded.

Return type

int

Raises

ImportError --

insights.core.dr.mark_hidden(component)[source]
insights.core.dr.observer(component_type=<class 'insights.core.dr.ComponentType'>)[source]

You can use @broker.observer() as a decorator to your callback instead of add_observer().

insights.core.dr.run(components=None, broker=None)[source]

Executes components in an order that satisfies their dependency relationships.

Keyword Arguments
  • components -- Can be one of a dependency graph, a single component, a component group, or a component type. If it’s anything other than a dependency graph, the appropriate graph is built for you before evaluation.

  • broker (Broker) -- Optionally pass a broker to use for evaluation. One is created by default, but it’s often useful to seed a broker with an initial dependency.

Returns

The broker after evaluation.

Return type

Broker

insights.core.dr.run_all(components=None, broker=None, pool=None)[source]
insights.core.dr.run_components(ordered_components, components, broker)[source]

Runs a list of preordered components using the provided broker.

This function allows callers to order components themselves and cache the result so they don’t incur the toposort overhead on every run.

insights.core.dr.run_incremental(components=None, broker=None)[source]

Executes components in an order that satisfies their dependency relationships. Disjoint subgraphs are executed one at a time, and a broker containing the results for each is yielded. If a broker is passed here, its instances are used to seed the broker used to hold state for each subgraph.

Keyword Arguments
  • components -- Can be one of a dependency graph, a single component, a component group, or a component type. If it’s anything other than a dependency graph, the appropriate graph is built for you before evaluation.

  • broker (Broker) -- Optionally pass a broker to use for evaluation. One is created by default, but it’s often useful to seed a broker with an initial dependency.

Yields

Broker -- the broker used to evaluate each subgraph.

insights.core.dr.run_order(graph)[source]

Returns components in an order that satisfies their dependency relationships.

insights.core.dr.set_enabled(component, enabled=True)[source]

Enable a component for evaluation. If set to False, the component is skipped, and all components that require it will not execute.

If component is a fully qualified name string of a callable object instead of the callable object itself, the component’s module is loaded as a side effect of calling this function.

Parameters
  • component (str or callable) -- fully qualified name of the component or the component object itself.

  • enabled (bool) -- whether the component is enabled for evaluation.

Returns

None

insights.core.dr.split_requirements(requires)[source]
insights.core.dr.stringify_requirements(requires)[source]
insights.core.dr.walk_dependencies(root, visitor)[source]

Call visitor on root and all dependencies reachable from it in breadth first order.

Parameters
  • root (component) -- component function or class

  • visitor (function) -- signature is func(component, parent). The call on root is visitor(root, None).

insights.core.dr.walk_tree(root, method=<function get_dependencies>)[source]
class insights.core.dr.ComponentType(*deps, **kwargs)[source]

ComponentType is the base class for all component type decorators.

For Example:

class my_component_type(ComponentType):
    pass

@my_component_type(SshDConfig, InstalledRpms, [ChkConfig, UnitFiles], optional=[IPTables, IpAddr])
def my_func(sshd_config, installed_rpms, chk_config, unit_files, ip_tables, ip_addr):
    return installed_rpms.newest("bash")

Notice that the arguments to my_func correspond to the dependencies in the @my_component_type and are in the same order.

When used, a my_component_type instance is created whose __init__ gets passed dependencies and whose __call__ gets passed the component to run if dependencies are met.

Parameters to the decorator have these forms:

Criteria       Example Decorator Arguments     Description
Required       SshDConfig, InstalledRpms       A regular argument
At Least One   [ChkConfig, UnitFiles]          An argument as a list
Optional       optional=[IPTables, IpAddr]     A list following optional=

If a parameter is required, the value provided for it is guaranteed not to be None. In the example above, sshd_config and installed_rpms will not be None.

At least one of the arguments to parameters of an “at least one” list will not be None. In the example, either or both of chk_config and unit_files will not be None.

Any or all arguments for optional parameters may be None.

The following keyword arguments may be passed to the decorator:

requires

a list of components that all components decorated with this type will implicitly require. Additional components passed to the decorator will be appended to this list.

Type

list

optional

a list of components that all components decorated with this type will implicitly depend on optionally. Additional components passed as optional to the decorator will be appended to this list.

Type

list

metadata

an arbitrary dictionary of information to associate with the component you’re decorating. It can be retrieved with get_metadata.

Type

dict

tags

a list of strings that categorize the component. Useful for formatting output or sifting through results for components you care about.

Type

list

group

GROUPS.single or GROUPS.cluster. Used to organize components into “groups” that run together with insights.core.dr.run().

cluster

if True will put the component into the GROUPS.cluster group. Defaults to False. Overrides group if True.

Type

bool

get_missing_dependencies(broker)[source]

Gets required and at-least-one dependencies not provided by the broker.

invoke(results)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.

process(broker)[source]

Ensures dependencies have been met before delegating to self.invoke.

insights.core.exceptions

exception insights.core.exceptions.CalledProcessError(returncode, cmd, output=None)[source]

Bases: Exception

Raised if call fails.

Parameters
  • returncode (int) -- The return code of the process executing the command.

  • cmd (str) -- The command that was executed.

  • output (str) -- Any output the command produced.

exception insights.core.exceptions.ContentException[source]

Bases: SkipComponent

Raised whenever a datasource fails to get data.

exception insights.core.exceptions.InvalidArchive(msg)[source]

Bases: Exception

exception insights.core.exceptions.InvalidContentType(content_type)[source]

Bases: InvalidArchive

exception insights.core.exceptions.MissingRequirements(requirements)[source]

Bases: Exception

Raised during evaluation if a component’s dependencies aren’t met.

exception insights.core.exceptions.ParseException[source]

Bases: Exception

Exception that should be thrown from parsers that encounter exceptions they recognize while parsing. When this exception is thrown, the exception message and data are logged and no parser output data is saved.

exception insights.core.exceptions.SkipComponent[source]

Bases: Exception

This class should be raised by components that want to be taken out of dependency resolution.

exception insights.core.exceptions.SkipException[source]

Bases: SkipComponent

Exception that should be thrown from parsers that are explicitly written to look for errors in input data. If the expected error is not found then the parser should throw this exception to signal to the infrastructure that the parser’s output should not be retained.

exception insights.core.exceptions.TimeoutException[source]

Bases: Exception

Raised whenever a datasource hits the set timeout value.

exception insights.core.exceptions.ValidationException(msg, r=None)[source]

Bases: Exception

insights.core.filters

The filters module allows developers to apply filters to datasources, by adding them directly or through dependent components like parsers and combiners. A filter is a simple string, and it matches if it is contained anywhere within a line.

If a datasource has filters defined, it will return only lines matching at least one of them. If a datasource has no filters, it will return all lines.

Filters can be added to components like parsers and combiners, to apply consistent filtering to multiple underlying datasources that are configured as filterable.

Filters aren’t applicable to “raw” datasources, which are created with kind=RawFileProvider and have RegistryPoint instances with raw=True.

The addition of a single filter can cause a datasource to change from returning all lines to returning just those that match. Therefore, any filtered datasource should have at least one filter in the commit introducing it so downstream components don’t inadvertently change its behavior.

The benefit of this fragility is the ability to drastically reduce in-memory footprint and archive sizes. An additional benefit is the ability to evaluate only lines known to be free of sensitive information.

Filters added to a RegistryPoint will be applied to all datasources that implement it. Filters added to a datasource implementation apply only to that implementation.

For example, a filter added to Specs.ps_auxww will apply to DefaultSpecs.ps_auxww, InsightsArchiveSpecs.ps_auxww, SosSpecs.ps_auxww, etc. But a filter added to DefaultSpecs.ps_auxww will only apply to DefaultSpecs.ps_auxww. See the modules in insights.specs for those classes.

Filtering can be disabled globally by setting the environment variable INSIGHTS_FILTERS_ENABLED=False. This means that no datasources will be filtered even if filters are defined for them.

insights.core.filters.add_filter(component, patterns)[source]

Add a filter or list of filters to a component. When the component is a datasource, the filter is added directly to that datasource. When the component is a parser or combiner, the filter is added to the underlying filterable datasources by traversing the dependency graph. A filter is a simple string, and it matches if it is contained anywhere within a line.

Parameters
  • component (component) -- The component to filter, can be datasource, parser or combiner.

  • patterns (str, [str]) -- A string, list of strings, or set of strings to add to the datasource’s filters.

insights.core.filters.apply_filters(target, lines)[source]

Applies filters to the lines of a datasource. This function is used only in integration tests. Filters are applied in an equivalent but more performant way at run time.

insights.core.filters.dump(stream=None)[source]

Dumps a string representation of FILTERS to a stream, normally an open file. If none is passed, FILTERS is dumped to a default location within the project.

insights.core.filters.dumps()[source]

Returns a string representation of the FILTERS dictionary.

insights.core.filters.get_filters(component)[source]

Get the set of filters for the given datasource.

Filters added to a RegistryPoint will be applied to all datasources that implement it. Filters added to a datasource implementation apply only to that implementation.

For example, a filter added to Specs.ps_auxww will apply to DefaultSpecs.ps_auxww, InsightsArchiveSpecs.ps_auxww, SosSpecs.ps_auxww, etc. But a filter added to DefaultSpecs.ps_auxww will only apply to DefaultSpecs.ps_auxww. See the modules in insights.specs for those classes.

Parameters

component (a datasource) -- The target datasource

Returns

The set of filters defined for the datasource

Return type

set

insights.core.filters.load(stream=None)[source]

Loads filters from a stream, normally an open file. If one is not passed, filters are loaded from a default location within the project.

insights.core.filters.loads(string)[source]

Loads the filters dictionary given a string.

insights.core.plugins

The plugins module defines the components used by the rest of insights and specializes their interfaces and execution model where required.

This module includes the following ComponentType subclasses:

It also contains the following Response subclasses that rules may return:

class insights.core.plugins.PluginType(*deps, **kwargs)[source]

Bases: ComponentType

PluginType is the base class of plugin types like datasource, rule, etc. It provides a default invoke method that catches exceptions we don’t want bubbling to the top of the evaluation loop. These exceptions are commonly raised by datasource components but could be in the context of any component since most datasource runtime errors are lazy.

It’s possible for a datasource to “succeed” and return an object but for an exception to be raised when the parser tries to access the content of that object. For example, when a command datasource is evaluated, it only checks that the command exists and is executable. Invocation of the command itself is delayed until the parser asks for its value. This helps with performance and memory consumption.

invoke(broker)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.

class insights.core.plugins.Response(key, **kwargs)[source]

Bases: dict

Response is the base class of response types that can be returned from rules.

Subclasses must call __init__ of this class via super() and must provide the response_type class attribute.

The key_name class attribute is optional, but if one is specified, the first argument to __init__ must not be None. If key_name is None, then the first argument to __init__ should be None. It’s best to override __init__ in subclasses so users aren’t required to pass None explicitly.

adjust_for_length(key, r, kwargs)[source]

Converts the response to a string and compares its length to a max length specified in settings. If the response is too long, an error is logged, and an abbreviated response is returned instead.

get_key()[source]

Helper function that uses the response’s key_name to look up the response identifier. For a rule, this is like response.get(“error_key”).

key_name = None

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = None

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

validate_key(key)[source]

Called if the key_name class attribute is not None.

validate_kwargs(kwargs)[source]

Validates expected subclass attributes and constructor keyword arguments.

class insights.core.plugins.combiner(*deps, **kwargs)[source]

Bases: PluginType

A decorator for a component that composes or “combines” other components.

A typical use case is hiding slight variations in related parser interfaces. Another use case is to combine several related parsers behind a single, cohesive, higher level interface.

class insights.core.plugins.component(*deps, **kwargs)[source]

Bases: PluginType

class insights.core.plugins.condition(*deps, **kwargs)[source]

Bases: PluginType

ComponentType used to encapsulate boolean logic you’d like to have analyzed by a rule analysis system. Conditions should return truthy values. None is also a valid return type for conditions, so rules that depend on conditions that might return None should check their validity.

class insights.core.plugins.datasource(*deps, **kwargs)[source]

Bases: PluginType

Decorates a component that one or more insights.core.Parser subclasses will consume.

filterable = False
invoke(broker)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.

multi_output = False
raw = False
class insights.core.plugins.fact(*deps, **kwargs)[source]

Bases: PluginType

ComponentType for a component that surfaces a dictionary or list of dictionaries that will be used later by cluster rules. The data from a fact is converted to a pandas DataFrame.

class insights.core.plugins.incident(*deps, **kwargs)[source]

Bases: PluginType

ComponentType for a component used by rules that allows automated statistical analysis.

insights.core.plugins.is_combiner(component)[source]
insights.core.plugins.is_component(obj)[source]
insights.core.plugins.is_datasource(component)[source]
insights.core.plugins.is_parser(component)[source]
insights.core.plugins.is_rule(component)[source]
insights.core.plugins.is_type(component, _type)[source]
class insights.core.plugins.make_fail(key, **kwargs)[source]

Bases: make_response

Returned by a rule to signal that its conditions have been met.

Example:

# completely made up package
buggy = InstalledRpms.from_package("bash-3.4.23-1.el7")

@rule(InstalledRpms)
def report(installed_rpms):
    bash = installed_rpms.newest("bash")
    if bash == buggy:
        return make_fail("BASH_BUG_123", bash=bash)
    return make_pass("BASH", bash=bash)

class insights.core.plugins.make_fingerprint(key, **kwargs)[source]

Bases: Response

key_name = 'fingerprint_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'fingerprint'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_info(key, **kwargs)[source]

Bases: Response

Returned by a rule to surface information about a system.

Example:

@rule(InstalledRpms)
def report(rpms):
    bash = rpms.newest("bash")
    return make_info("BASH_VERSION", bash=bash.nvra)

key_name = 'info_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'info'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_metadata(**kwargs)[source]

Bases: Response

Allows a rule to convey additional metadata about a system to downstream systems. It doesn’t convey success or failure, only information that may be aggregated with other make_metadata responses. As such, it has no response key.

response_type = 'metadata'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_metadata_key(key, value)[source]

Bases: Response

adjust_for_length(key, r, kwargs)[source]

Converts the response to a string and compares its length to a max length specified in settings. If the response is too long, an error is logged, and an abbreviated response is returned instead.

key_name = 'key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'metadata_key'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_none[source]

Bases: Response

Used to create a response for a rule that returns None.

This is not intended for use by plugins, only by the infrastructure, but it is not private so that it can easily be added to reporting.

key_name = 'none_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'none'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_pass(key, **kwargs)[source]

Bases: Response

Returned by a rule to signal that its conditions explicitly have not been met. In other words, the rule has all of the information it needs to determine that the system it’s analyzing is not in the state the rule was meant to catch.

An example rule might check whether a system is vulnerable to a well defined exploit or has a bug in a specific version of a package. If it can say for sure “the system does not have this exploit” or “the system does not have the buggy version of the package installed”, then it should return an instance of make_pass.

Example:

# completely made up package
buggy = InstalledRpms.from_package("bash-3.4.23-1.el7")

@rule(InstalledRpms)
def report(installed_rpms):
    bash = installed_rpms.newest("bash")
    if bash == buggy:
        return make_fail("BASH_BUG_123", bash=bash)
    return make_pass("BASH", bash=bash)

key_name = 'pass_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'pass'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_response(key, **kwargs)[source]

Bases: Response

Returned by a rule to signal that its conditions have been met.

Example:

# completely made up package
buggy = InstalledRpms.from_package("bash-3.4.23-1.el7")

@rule(InstalledRpms)
def report(installed_rpms):
    bash = installed_rpms.newest("bash")
    if bash == buggy:
        return make_response("BASH_BUG_123", bash=bash)
    return make_pass("BASH", bash=bash)

Deprecated: use make_fail instead.

key_name = 'error_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'rule'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.metadata(*args, **kwargs)[source]

Bases: parser

Used for old cluster uber-archives.

Deprecated since version 1.x.

Warning

Do not use this component type.

requires = ['metadata.json']

a list of components that all components decorated with this type will implicitly require. Additional components passed to the decorator will be appended to this list.

class insights.core.plugins.parser(*args, **kwargs)[source]

Bases: PluginType

Decorates a component responsible for parsing the output of a datasource. @parser should accept multiple arguments, the first will ALWAYS be the datasource the parser component should handle. Any subsequent argument will be a component used to determine if the parser should fire. @parser should only decorate subclasses of insights.core.Parser.

Warning

If a Parser component handles a datasource that returns a list, a Parser instance will be created for each element of the list. Combiners or rules that depend on the Parser will be passed the list of instances and not a single parser instance. By default, if any parser in the list succeeds, those parsers are passed on to dependents, even if others fail. If all parsers should succeed or fail together, pass continue_on_error=False.

invoke(broker)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.

class insights.core.plugins.remoteresource(*deps, **kwargs)[source]

Bases: PluginType

ComponentType for a component for remote web resources.

class insights.core.plugins.rule(*args, **kwargs)[source]

Bases: PluginType

Decorator for components that encapsulate some logic that depends on the data model of a system. Rules can depend on datasource instances, parser instances, combiner instances, or anything else.

For example:

@rule(SshDConfig, InstalledRpms, [ChkConfig, UnitFiles], optional=[IPTables, IpAddr])
def report(sshd_config, installed_rpms, chk_config, unit_files, ip_tables, ip_addr):
    # ...
    # ... some complicated logic
    # ...
    bash = installed_rpms.newest("bash")
    return make_pass("BASH", bash=bash)

Notice that the arguments to report correspond to the dependencies in the @rule decorator and are in the same order.

Parameters to the decorator have these forms:

Criteria        Example Decorator Arguments     Description

Required        SshDConfig, InstalledRpms       Regular arguments

At Least One    [ChkConfig, UnitFiles]          An argument as a list

Optional        optional=[IPTables, IpAddr]     A list following optional=

If a parameter is required, the value provided for it is guaranteed not to be None. In the example above, sshd_config and installed_rpms will not be None.

At least one of the parameters in an “at least one” list will not be None. In the example, either or both of chk_config and unit_files will not be None.

Any or all arguments for optional parameters may be None.

The following keyword arguments may be passed to the decorator:

Keyword Arguments
  • requires (list) -- a list of components that all components decorated with this type will require. Instead of using requires=[...], just pass dependencies as variable arguments to @rule as in the example above.

  • optional (list) -- a list of components that all components decorated with this type will implicitly depend on optionally. Additional components passed as optional to the decorator will be appended to this list.

  • metadata (dict) -- an arbitrary dictionary of information to associate with the component you’re decorating. It can be retrieved with get_metadata.

  • tags (list) -- a list of strings that categorize the component. Useful for formatting output or sifting through results for components you care about.

  • group -- GROUPS.single or GROUPS.cluster. Used to organize components into “groups” that run together with insights.core.dr.run().

  • cluster (bool) -- if True will put the component into the GROUPS.cluster group. Defaults to False. Overrides group if True.

  • content (string or dict) -- a jinja2 template or a dictionary of jinja2 templates. The Response subclasses rules can return are dictionaries. make_pass, make_fail, and make_response all accept a key first and then arbitrary keyword arguments. If content is a dictionary, the key is used to look up the template into which the remaining keyword arguments are interpolated. If content is a string, it is used for all return values of the rule. If content isn’t defined but a CONTENT variable is declared in the module, that variable is used for every rule in the module; it too can be a string or a dictionary of templates.

  • links (dict) -- a dictionary with strings as keys and lists of urls as values. The keys categorize the urls, e.g. “kcs” for kcs urls and “bugzilla” for bugzilla urls.
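The key-to-template resolution for the content keyword can be illustrated in plain Python, with str.format standing in for the jinja2 rendering the real system performs (the keys and fields below are made up):

```python
# Illustration of resolving a dict-valued ``content``: the response key
# chosen by make_pass/make_fail selects a template, and the response's
# keyword arguments are interpolated into it.  str.format stands in for
# jinja2 here purely for illustration.

CONTENT = {
    "BASH_BUG_123": "Bash {bash} has bug 123.",   # hypothetical keys
    "BASH": "Bash {bash} is up to date.",
}

def render(key, **kwargs):
    template = CONTENT[key]            # look up template by response key
    return template.format(**kwargs)   # interpolate keyword arguments

print(render("BASH_BUG_123", bash="bash-3.4.23-1.el7"))
```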

content = None
process(broker)[source]

Ensures dependencies have been met before delegating to self.invoke.

insights.core.remote_resource

class insights.core.remote_resource.CachedRemoteResource[source]

Bases: RemoteResource

RemoteResource subclass that sets up caching for subsequent Web resource requests.

Examples

>>> from insights.core.remote_resource import CachedRemoteResource
>>> crr = CachedRemoteResource()
>>> rtn = crr.get("http://google.com")
>>> print (rtn.content)
backend = 'DictCache'

Type of storage for the cache: DictCache, FileCache, or RedisCache

Type

str

expire_after = 180

Amount of time in seconds before the cache expires

Type

float

file_cache_path = '.web_cache'

Path where the file cache will be stored if the FileCache backend is specified

Type

str

redis_host = 'localhost'

Hostname of redis instance if RedisCache backend is specified

Type

str

redis_port = 6379

Port used to contact the redis instance if RedisCache backend is specified

Type

int

class insights.core.remote_resource.DefaultHeuristic(expire_after)[source]

Bases: BaseHeuristic

BaseHeuristic subclass that sets the default caching headers if not supplied by the remote service.

default_cache_vars = 'Remote service caching headers not set correctly, using default caching'

Message content warning that the response from the remote server did not return proper HTTP cache headers so we will use default cache settings

Type

str

server_cache_headers = 'Caching being done based on caching headers returned by remote service'

Message content warning that we are using cache settings returned by the remote server.

Type

str

update_headers(response)[source]

Returns the updated caching headers.

Parameters

response (HttpResponse) -- The response from the remote service

Returns

(HttpResponse.Headers): Http caching headers

Return type

response

warning(response)[source]

Return a valid 1xx warning header value describing the cache adjustments.

The response is provided to allow warnings like 113 (http://tools.ietf.org/html/rfc7234#section-5.5.4), where we need to explicitly say the response is over 24 hours old.

class insights.core.remote_resource.RemoteResource(session=None)[source]

Bases: object

RemoteResource class for accessing external Web resources.

Examples

>>> from insights.core.remote_resource import RemoteResource
>>> rr = RemoteResource()
>>> rtn = rr.get("http://google.com")
>>> print (rtn.content)
get(url, params={}, headers={}, auth=(), certificate_path=None)[source]

Returns the response payload from the request to the given URL.

Parameters
  • url (str) -- The URL for the WEB API that the request is being made too.

  • params (dict) -- Dictionary containing the query string parameters.

  • headers (dict) -- HTTP Headers that may be needed for the request.

  • auth (tuple) -- User ID and password for Basic Auth

  • certificate_path (str) -- Path to the ssl certificate.

Returns

(HttpResponse): Response object from requests.get api request

Return type

response

timeout = 10

Time in seconds for the requests.get api call to wait before returning a timeout exception

Type

float

insights.core.spec_factory

class insights.core.spec_factory.CommandOutputProvider(cmd, ctx, root='insights_commands', args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None)[source]

Bases: ContentProvider

Class used in datasources to return output from commands.

create_args()[source]
create_env()[source]
load()[source]
validate()[source]
write(dst)[source]
class insights.core.spec_factory.ContainerCommandProvider(cmd_path, ctx, image=None, args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None)[source]

Bases: ContainerProvider

class insights.core.spec_factory.ContainerFileProvider(cmd_path, ctx, image=None, args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None)[source]

Bases: ContainerProvider

class insights.core.spec_factory.ContainerProvider(cmd_path, ctx, image=None, args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None)[source]

Bases: CommandOutputProvider

class insights.core.spec_factory.ContentProvider[source]

Bases: object

property content
load()[source]
property path
stream()[source]

Returns a generator of lines instead of a list of lines.

class insights.core.spec_factory.DatasourceProvider(content, relative_path, root='/', ds=None, ctx=None)[source]

Bases: ContentProvider

load()[source]
write(dst)[source]
class insights.core.spec_factory.FileProvider(relative_path, root='/', ds=None, ctx=None)[source]

Bases: ContentProvider

validate()[source]
class insights.core.spec_factory.RawFileProvider(relative_path, root='/', ds=None, ctx=None)[source]

Bases: FileProvider

Class used in datasources that returns the contents of a file as a single string. The file is not filtered.

load()[source]
write(dst)[source]
class insights.core.spec_factory.RegistryPoint(metadata=None, multi_output=False, raw=False, filterable=False)[source]

Bases: object

insights.core.spec_factory.SAFE_ENV = {'LANG': 'C.UTF-8', 'LC_ALL': 'C', 'PATH': '/bin:/usr/bin:/sbin:/usr/sbin:/usr/share/Modules/bin'}

A minimal set of environment variables for use in subprocess calls
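A sketch of how such a minimal environment can be applied to a subprocess call so command output is locale-stable (a minimal illustration, assuming a POSIX system with echo on the PATH):

```python
import subprocess

# Mirroring the idea behind SAFE_ENV: run commands with a fixed locale
# and a known PATH so output is deterministic and parseable.
SAFE_ENV = {
    "LANG": "C.UTF-8",
    "LC_ALL": "C",
    "PATH": "/bin:/usr/bin:/sbin:/usr/sbin",
}

proc = subprocess.run(
    ["echo", "hello"], env=SAFE_ENV, capture_output=True, text=True
)
print(proc.stdout.strip())   # hello
```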

class insights.core.spec_factory.SerializedOutputProvider(relative_path, root='/', ds=None, ctx=None)[source]

Bases: TextFileProvider

create_args()[source]
class insights.core.spec_factory.SerializedRawOutputProvider(relative_path, root='/', ds=None, ctx=None)[source]

Bases: RawFileProvider

class insights.core.spec_factory.SpecDescriptor(func)[source]

Bases: object

class insights.core.spec_factory.SpecSet[source]

Bases: object

The base class for all spec declarations. Extend this class and define your datasources directly or with a SpecFactory.

context_handlers = {}
registry = {}
class insights.core.spec_factory.SpecSetMeta(name, bases, dct)[source]

Bases: type

The metaclass that converts RegistryPoint markers to registry point datasources and hooks implementations for them into the registry.

class insights.core.spec_factory.TextFileProvider(relative_path, root='/', ds=None, ctx=None)[source]

Bases: FileProvider

Class used in datasources that returns the contents of a file as a list of lines. Each line is filtered if filters are defined for the datasource.

create_args()[source]
load()[source]
write(dst)[source]
class insights.core.spec_factory.command_with_args(cmd, provider, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: object

Execute a command that has dynamic arguments

Parameters
  • cmd (str) -- the command to execute; the command string is split apart so that the dynamic arguments can be supplied.

  • provider (str or tuple) -- argument string or a tuple of argument strings.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

  • override_env (dict) -- A dict of environment variables to override from the calling process when the command is invoked.

Returns

A datasource that returns the output of a command that takes

specified arguments passed by the provider.

Return type

function

class insights.core.spec_factory.container_collect(provider, path=None, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: foreach_execute

Collects the files at the resulting path in running containers.

Parameters
  • provider (list) -- a list of tuples.

  • path (str) -- the file path template with substitution parameters. The path can also be passed via the provider when it varies per case; in that case, the path argument should be None.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

Returns

A datasource that returns a list of file contents created by

substituting each element of provider into the path template.

Return type

function

class insights.core.spec_factory.container_execute(provider, cmd, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: foreach_execute

Execute a command for each element in provider inside a container. The provider is the output of a different datasource that returns a list of tuples. In each tuple, the container engine provider (“podman” or “docker”) and the container_id are required elements; the remaining elements, if any, are the arguments passed to the command.

Parameters
  • provider (list) -- a list of tuples; in each tuple, the container engine provider (“podman” or “docker”) and the container_id are required elements, and the remaining elements, if any, are the arguments passed to the cmd.

  • cmd (str) -- a command with substitution parameters. A command string that might contain multiple commands separated by a pipe is broken apart and prepared for subprocess operations, e.g. a command with filters applied.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

Returns

A datasource that returns a list of outputs for each command

created by substituting each element of provider into the cmd template.

Return type

function

insights.core.spec_factory.deserialize_command_output(_type, data, root)[source]
insights.core.spec_factory.deserialize_container_command(_type, data, root)[source]
insights.core.spec_factory.deserialize_container_file(_type, data, root)[source]
insights.core.spec_factory.deserialize_datasource_provider(_type, data, root)[source]
insights.core.spec_factory.deserialize_raw_file_provider(_type, data, root)[source]
insights.core.spec_factory.deserialize_text_provider(_type, data, root)[source]
insights.core.spec_factory.enc(s)[source]
insights.core.spec_factory.escape(s)[source]
class insights.core.spec_factory.find(spec, pattern)[source]

Bases: object

Helper class for extracting specific lines from a datasource for direct consumption by a rule.

service_starts = find(Specs.audit_log, "SERVICE_START")

@rule(service_starts)
def report(starts):
    return make_info("SERVICE_STARTS", num_starts=len(starts))
Parameters
  • spec (datasource) -- some datasource, ideally filterable.

  • pattern (string / list) -- a string or list of strings to match (no patterns supported)

Returns

A dict where each key is a command, path, or spec name, and each value is a non-empty list of matching lines. Only paths with matching lines are included.

Raises

SkipComponent -- if no paths have matching lines.
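The shape of find’s result can be sketched in plain Python: substring matching (no regex patterns), keeping only sources that have at least one matching line. The find_matches helper and the sample log path are illustrative, not part of the insights API:

```python
# Sketch of find's matching semantics: plain substring matching over
# each source's lines, keeping only sources with non-empty results.

def find_matches(sources, patterns):
    """sources: dict of name -> list of lines; patterns: str or list."""
    if isinstance(patterns, str):
        patterns = [patterns]
    results = {}
    for name, lines in sources.items():
        hits = [line for line in lines if any(p in line for p in patterns)]
        if hits:                               # empty lists are dropped
            results[name] = hits
    if not results:
        raise LookupError("no matches")        # SkipComponent in insights
    return results

audit = {"/var/log/audit/audit.log": [
    "type=SERVICE_START unit=sshd",
    "type=USER_LOGIN user=root",
]}
print(find_matches(audit, "SERVICE_START"))
```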

class insights.core.spec_factory.first_file(paths, context=None, deps=[], kind=<class 'insights.core.spec_factory.TextFileProvider'>, **kwargs)[source]

Bases: object

Creates a datasource that returns the first existing and readable file in paths.

Parameters
  • paths (list) -- list of paths to find and read

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- One of TextFileProvider or RawFileProvider.

Returns

A datasource that returns the first file in paths that exists

and is readable

Return type

function

class insights.core.spec_factory.first_of(deps)[source]

Bases: object

Given a list of dependencies, returns the first of the list that exists in the broker. At least one must be present, or this component won’t fire.

class insights.core.spec_factory.foreach_collect(provider, path, ignore=None, context=<class 'insights.core.context.HostContext'>, deps=[], kind=<class 'insights.core.spec_factory.TextFileProvider'>, **kwargs)[source]

Bases: object

Substitutes each element in provider into path and collects the files at the resulting paths.

Parameters
  • provider (list) -- a list of elements or tuples.

  • path (str) -- a path template with substitution parameters.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- one of TextFileProvider or RawFileProvider

Returns

A datasource that returns a list of file contents created by

substituting each element of provider into the path template.

Return type

function

class insights.core.spec_factory.foreach_execute(provider, cmd, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: object

Execute a command for each element in provider. Provider is the output of a different datasource that returns a list of single elements or a list of tuples. The command should have %s substitution parameters equal to the number of elements in each tuple of the provider.

Parameters
  • provider (list) -- a list of elements or tuples.

  • cmd (str) -- a command with substitution parameters. A command string that might contain multiple commands separated by a pipe is broken apart and prepared for subprocess operations, e.g. a command with filters applied.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

  • override_env (dict) -- A dict of environment variables to override from the calling process when the command is invoked.

Returns

A datasource that returns a list of outputs for each command

created by substituting each element of provider into the cmd template.

Return type

function
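The %s substitution that foreach_execute performs can be sketched with plain string formatting (expand_commands is illustrative only; the real datasource also executes each resulting command):

```python
# Sketch of foreach_execute's template expansion: each provider element
# (a single value or a tuple) fills the %s placeholders in cmd.

def expand_commands(provider, cmd):
    commands = []
    for element in provider:
        args = element if isinstance(element, tuple) else (element,)
        commands.append(cmd % args)    # one concrete command per element
    return commands

pids = ["101", "202"]                  # e.g. output of another datasource
print(expand_commands(pids, "cat /proc/%s/status"))
```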

class insights.core.spec_factory.glob_file(patterns, ignore=None, context=None, deps=[], kind=<class 'insights.core.spec_factory.TextFileProvider'>, max_files=1000, **kwargs)[source]

Bases: object

Creates a datasource that reads all files matching the glob pattern(s).

Parameters
  • patterns (str or [str]) -- glob pattern(s) of paths to read.

  • ignore (regex) -- a regular expression that is used to filter the paths matched by pattern(s).

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- One of TextFileProvider or RawFileProvider.

  • max_files (int) -- Maximum number of glob files to process.

Returns

A datasource that reads all files matching the glob patterns.

Return type

function

class insights.core.spec_factory.head(dep, **kwargs)[source]

Bases: object

Return the first element of any datasource that produces a list.

class insights.core.spec_factory.listdir(path, context=None, ignore=None, deps=[])[source]

Bases: object

Execute a simple directory listing of all the files and directories in path.

Parameters
  • path (str) -- directory or glob pattern to list.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • ignore (str) -- regular expression defining paths to ignore.

Returns

A datasource that returns the list of files and directories

in the directory specified by path

Return type

function

insights.core.spec_factory.mangle_command(command, name_max=255)[source]

Mangle a command line string into something suitable for use as the basename of a filename. At minimum this function must remove slashes, but it also does other things to clean the basename: removing directory names from the command name and replacing many non-word characters with underscores, in addition to replacing slashes with dots.

By default, curly braces, ‘{’ and ‘}’, are replaced with underscores; set ‘has_variables’ to leave curly braces alone.

This function was copied from the function that insights-client uses to create the name of the file in which it captures the output of the command.

Here, server side, it is used to figure out which file in the archive contains the output of a command. Server side, the command may contain references to variables (names matching curly braces) that will be expanded before the name is actually used as a file name.

To completely mimic the insights-client behavior, curly braces need to be replaced with underscores. If the command has variable references, the curly braces must be left alone. Set has_variables to leave curly braces alone.

This implementation of ‘has_variables’ assumes that variable names only contain characters that are not replaced by mangle_command.
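An approximate sketch of the mangling rules described above (the real mangle_command has more cases, including the curly-brace handling; the regexes here are an assumption for illustration):

```python
import re

# Approximate sketch of mangle_command: strip a leading bin/sbin
# directory from the command name, collapse runs of disallowed
# characters into underscores, turn remaining slashes into dots,
# and truncate to name_max characters.

def mangle(command, name_max=255):
    name = re.sub(r"^/(usr/)?(bin|sbin)/", "", command.strip())
    name = re.sub(r"[^\w\-\.\/]+", "_", name)    # disallowed runs -> _
    name = name.replace("/", ".").strip(" ._-")  # slashes -> dots
    return name[:name_max]

print(mangle("/usr/bin/ls -la /etc/sysconfig"))
```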

insights.core.spec_factory.serialize_command_output(obj, root)[source]
insights.core.spec_factory.serialize_container_command(obj, root)[source]
insights.core.spec_factory.serialize_container_file_output(obj, root)[source]
insights.core.spec_factory.serialize_datasource_provider(obj, root)[source]
insights.core.spec_factory.serialize_raw_file_provider(obj, root)[source]
insights.core.spec_factory.serialize_text_file_provider(obj, root)[source]
class insights.core.spec_factory.simple_command(cmd, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: object

Execute a simple command that has no dynamic arguments

Parameters
  • cmd (str) -- the command(s) to execute. A command string that might contain multiple commands separated by a pipe is broken apart and prepared for subprocess operations, e.g. a command with filters applied.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

  • override_env (dict) -- A dict of environment variables to override from the calling process when the command is invoked.

Returns

A datasource that returns the output of a command that takes

no arguments

Return type

function

class insights.core.spec_factory.simple_file(path, context=None, deps=[], kind=<class 'insights.core.spec_factory.TextFileProvider'>, **kwargs)[source]

Bases: object

Creates a datasource that reads the file at path when evaluated.

Parameters
  • path (str) -- path to the file to read

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- One of TextFileProvider or RawFileProvider.

Returns

A datasource that reads the file at path.

Return type

function

insights.core.taglang

Simple language for defining predicates against a list or set of strings.

Operator Precedence:
  • ! high - opposite truth value of its predicate

  • / high - starts a regex that continues until whitespace unless quoted

  • & medium - “and” of two predicates

  • | low - “or” of two predicates

  • , low - “or” of two predicates. Synonym for |.

It supports grouping with parentheses and quoted strings/regexes surrounded with either single or double quotes.

Examples

>>> pred = parse("a | b & !c")  # means (a or (b and (not c)))
>>> pred(["a"])
True
>>> pred(["b"])
True
>>> pred(["b", "c"])
False
>>> pred = parse("/net | apache")
>>> pred(["networking"])
True
>>> pred(["mynetwork"])
True
>>> pred(["apache"])
True
>>> pred(["security"])
False
>>> pred = parse("(a | b) & c")
>>> pred(["a", "c"])
True
>>> pred(["b", "c"])
True
>>> pred(["a"])
False
>>> pred(["b"])
False
>>> pred(["c"])
False

Regular expressions start with a forward slash / and continue until whitespace unless they are quoted with either single or double quotes. This means that they can consume what would normally be considered an operator or a closing parenthesis if you aren’t careful.

For example, this is a parse error because the regex consumes the comma:

>>> pred = parse("/net, apache")
Exception

Instead, do this:

>>> pred = parse("/net , apache")

or this:

>>> pred = parse("/net | apache")

or this:

>>> pred = parse("'/net', apache")

class insights.core.taglang.And(left, right)[source]

Bases: Predicate

The values must satisfy both the left and the right condition.

test(value)[source]
class insights.core.taglang.Eq(value)[source]

Bases: Predicate

The value must be in the set of values.

test(values)[source]
class insights.core.taglang.Not(pred)[source]

Bases: Predicate

The values must not satisfy the wrapped condition.

test(value)[source]
class insights.core.taglang.Or(left, right)[source]

Bases: Predicate

The values must satisfy either the left or the right condition.

test(value)[source]
class insights.core.taglang.Predicate[source]

Bases: object

Provides __call__ for invoking the Predicate like a function without having to explicitly call its test method.

class insights.core.taglang.Regex(value)[source]

Bases: Predicate

The regex must match at least one of the values.

test(values)[source]
insights.core.taglang.negate(args)[source]
insights.core.taglang.oper(args)[source]

insights.parsers

insights.parsers.calc_offset(lines, target, invert_search=False, require_all=False)[source]

Function to search a list of lines for the first line starting with a target string. If target is None or an empty string then 0 is returned; this allows target to be checked here instead of in each calling function. Each line is stripped of leading spaces prior to comparison, but the target strings are not stripped. See parse_fixed_table in this module for sample usage.

Parameters
  • lines (list) -- List of strings.

  • target (list) -- List of strings to search for at the beginning of any line in lines.

  • invert_search (boolean) -- If True, the search continues until the first line is found that does not match anything in target. An empty line is implicitly included in target. Default is False. This is typically used to trim trailing lines off of a file by passing reversed(lines) as the lines argument.

  • require_all (boolean) -- If True, the search also requires that every item of target appears in the line. This flag only works when invert_search is False; when invert_search is True, it is ignored.

Returns

index into lines indicating the location of target. If target is None or an empty string, 0 is returned as the offset. If invert_search is True, the returned index points to the line after the last matching line.

Return type

int

Raises

ValueError -- Raised if a target string is specified and it is not found in the input lines.

Examples

>>> lines = [
... '#   ',
... 'Warning line',
... 'Error line',
... '    data 1 line',
... '    data 2 line']
>>> target = ['data', '2', 'line']
>>> calc_offset(lines, target)
3
>>> target = ['#', 'Warning', 'Error']
>>> calc_offset(lines, target, invert_search=True)
3
>>> target = ['data', '2', 'line']
>>> calc_offset(lines, target, require_all=True)
4
>>> target = ['#', 'Warning', 'Error']
>>> calc_offset(lines, target, invert_search=True, require_all=True)  # `require_all` doesn't work when `invert_search=True`
3
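The search rules above (leading-space stripping, the invert_search and require_all flags, and the early return for an empty target) can be sketched as follows. This is an illustrative reimplementation of the documented behavior under those assumptions, not the actual insights.parsers.calc_offset; the name calc_offset_sketch is hypothetical.

```python
def calc_offset_sketch(lines, target, invert_search=False, require_all=False):
    """Hypothetical reimplementation of the documented calc_offset rules."""
    # If target is None or empty, 0 is returned as the offset.
    if not target:
        return 0
    for offset, line in enumerate(lines):
        found = any(line.lstrip().startswith(t) for t in target)
        if invert_search:
            # Stop at the first line matching nothing in target; an empty
            # line is implicitly included in target, so it never stops the
            # search. require_all is ignored on this path.
            if not found and line.strip() != '':
                return offset
        else:
            # With require_all, every target item must also appear in the line.
            if found and (not require_all or all(t in line for t in target)):
                return offset
    raise ValueError("Line containing '{0}' was not found.".format(','.join(target)))
```

Run against the example lines above, this sketch reproduces the documented offsets, including require_all being ignored when invert_search is True.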
insights.parsers.get_active_lines(lines, comment_char='#')[source]

Returns lines, or parts of lines, from content that are not commented out or completely empty. The resulting lines are all individually stripped.

This is useful for parsing many config files such as ifcfg.

Parameters
  • lines (list) -- List of strings to parse.

  • comment_char (str) -- String indicating that all chars following are part of a comment and will be removed from the output.

Returns

List of valid lines remaining in the input.

Return type

list

Examples

>>> lines = [
... 'First line',
... '   ',
... '# Comment line',
... 'Inline comment # comment',
... '          Whitespace          ',
... 'Last line']
>>> get_active_lines(lines)
['First line', 'Inline comment', 'Whitespace', 'Last line']
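The filtering described above can be sketched in a few lines: cut each line at the comment character, strip whitespace, and keep only non-empty results. This is an illustrative sketch of the documented behavior, not the library's implementation; the name get_active_lines_sketch is hypothetical.

```python
def get_active_lines_sketch(lines, comment_char='#'):
    """Hypothetical reimplementation of get_active_lines: drop the comment
    portion of each line, strip surrounding whitespace, and keep only the
    lines with content remaining."""
    active = []
    for line in lines:
        # Everything after comment_char is part of a comment.
        stripped = line.split(comment_char, 1)[0].strip()
        if stripped:
            active.append(stripped)
    return active
```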

insights.parsers.keyword_search(rows, **kwargs)[source]

Takes a list of dictionaries and finds all the dictionaries where the keys and values match those found in the keyword arguments.

Keys in the row data have ' ' and '-' replaced with '_', so they can match the keyword argument parsing. For example, the keyword argument 'fix_up_path' will match a key named 'fix-up path'.

In addition, several suffixes can be added to the key name to do partial matching of values:

  • ‘__contains’ will test whether the data value contains the given value.

  • ‘__startswith’ tests if the data value starts with the given value

  • ‘__lower_value’ compares the lower-case version of the data and given values.

Parameters
  • rows (list) -- A list of dictionaries representing the data to be searched.

  • **kwargs (dict) -- keyword-value pairs corresponding to the fields that need to be found and their required values in the data rows.

Returns

The list of rows that match the search keywords. If no keyword arguments are given, no rows are returned.

Return type

(list)

Examples

>>> rows = [
...     {'domain': 'oracle', 'type': 'soft', 'item': 'nofile', 'value': 1024},
...     {'domain': 'oracle', 'type': 'hard', 'item': 'nofile', 'value': 65536},
...     {'domain': 'oracle', 'type': 'soft', 'item': 'stack', 'value': 10240},
...     {'domain': 'oracle', 'type': 'hard', 'item': 'stack', 'value': 3276},
...     {'domain': 'root', 'type': 'soft', 'item': 'nproc', 'value': -1}]
...
>>> keyword_search(rows, domain='root')
[{'domain': 'root', 'type': 'soft', 'item': 'nproc', 'value': -1}]
>>> keyword_search(rows, item__contains='c')
[{'domain': 'oracle', 'type': 'soft', 'item': 'stack', 'value': 10240},
 {'domain': 'oracle', 'type': 'hard', 'item': 'stack', 'value': 3276},
 {'domain': 'root', 'type': 'soft', 'item': 'nproc', 'value': -1}]
>>> keyword_search(rows, domain__startswith='r')
[{'domain': 'root', 'type': 'soft', 'item': 'nproc', 'value': -1}]
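The key normalization and suffix matching described above can be sketched as follows. This is an illustrative sketch of the documented rules only (the real insights.parsers.keyword_search supports additional suffixes and edge cases); the name keyword_search_sketch is hypothetical.

```python
def keyword_search_sketch(rows, **kwargs):
    """Hypothetical sketch of keyword_search's documented matching rules."""
    # With no keyword arguments, no rows are returned.
    if not kwargs:
        return []

    def matches(row):
        # Keys in the row data have ' ' and '-' replaced with '_'.
        data = {k.replace(' ', '_').replace('-', '_'): v for k, v in row.items()}
        for key, want in kwargs.items():
            if key.endswith('__contains'):
                field, ok = key[:-len('__contains')], lambda got: want in got
            elif key.endswith('__startswith'):
                field, ok = key[:-len('__startswith')], lambda got: got.startswith(want)
            elif key.endswith('__lower_value'):
                field, ok = (key[:-len('__lower_value')],
                             lambda got: str(got).lower() == str(want).lower())
            else:
                field, ok = key, lambda got: got == want
            if field not in data or not ok(data[field]):
                return False
        return True

    return [row for row in rows if matches(row)]
```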
insights.parsers.optlist_to_dict(optlist, opt_sep=',', kv_sep='=', strip_quotes=False)[source]

Parse an option list into a dictionary.

Takes a list of options separated by opt_sep and places them into a dictionary with the default value of True. If kv_sep option is specified then key/value options key=value are parsed. Useful for parsing options such as mount options in the format rw,ro,rsize=32168,xyz.

Parameters
  • optlist (str) -- String of options to parse.

  • opt_sep (str) -- Separator used to split options.

  • kv_sep (str) -- If not None then optlist includes key=value pairs to be split, and this str is used to split them.

  • strip_quotes (bool) -- If set, will remove matching '"' and "'" characters from the start and end of each value. No quotes are removed from inside the string and mismatched quotes are not removed.

Returns

Returns a dictionary of names present in the list. If kv_sep is not None then the values will be the str on the right-hand side of kv_sep. If kv_sep is None then each key will have a default value of True.

Return type

dict

Examples

>>> optlist = 'rw,ro,rsize=32168,xyz'
>>> optlist_to_dict(optlist)
{'rw': True, 'ro': True, 'rsize': '32168', 'xyz': True}
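The parsing rules above can be sketched as follows: split on opt_sep, split each key=value option on kv_sep, and default bare options to True. This is an illustrative sketch of the documented behavior, not the library's implementation; the name optlist_to_dict_sketch is hypothetical.

```python
def optlist_to_dict_sketch(optlist, opt_sep=',', kv_sep='=', strip_quotes=False):
    """Hypothetical reimplementation of optlist_to_dict's documented rules."""
    result = {}
    for opt in optlist.split(opt_sep):
        if kv_sep is not None and kv_sep in opt:
            key, value = opt.split(kv_sep, 1)
            if strip_quotes:
                # Only matching quotes at both ends are removed.
                for q in ('"', "'"):
                    if len(value) > 1 and value.startswith(q) and value.endswith(q):
                        value = value[1:-1]
            result[key] = value
        else:
            # Options without a value default to True.
            result[opt] = True
    return result
```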
insights.parsers.parse_delimited_table(table_lines, delim=None, max_splits=-1, strip=True, header_delim='same as delimiter', heading_ignore=None, header_substitute=None,