API Documentation

insights.core

class insights.core.CommandParser(context, extra_bad_lines=None)[source]

Bases: Parser

This class checks output from the command defined in the spec.

Raises:

ContentException -- When context.content contains a single line and that line contains one of the strings in the bad_single_lines or extra_bad_lines list, or when context.content contains multiple lines and any line contains one of the strings in the bad_lines or extra_bad_lines list.

static validate_lines(results, bad_single_lines, bad_lines)[source]

This function returns False when:

1. The `results` is a single line and that line contains
   one of the strings in the `bad_single_lines` list.
2. The `results` contains multiple lines and any line
   contains one of the strings in the `bad_lines` list.

If no bad line is found, the function returns True.

Parameters:
  • results (list) -- The result lines of the output from the command defined by the command spec.

  • bad_single_lines (list) -- The list of bad lines that should be checked only when the result contains a single line.

  • bad_lines (list) -- The list of bad lines that should be checked only when the result contains multiple lines.

Returns:

True when no bad line is found, False otherwise.

Return type:

(Boolean)
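
For illustration, a minimal sketch of calling the static method directly; the marker strings here are hypothetical:

from insights.core import CommandParser

ok = CommandParser.validate_lines(
    ["first line of output", "second line of output"],
    ["usage:"],              # bad_single_lines, checked for single-line results
    ["command not found"],   # bad_lines, checked for multi-line results
)
# ok is True: no line contains a bad marker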

class insights.core.ConfigCombiner(confs, main_file, include_finder)[source]

Bases: ConfigComponent

Base Insights component class for Combiners of configuration files with include directives for supplementary configuration files. httpd and nginx are examples.

find_main(name)[source]
find_matches(confs, pattern)[source]
class insights.core.ConfigComponent[source]

Bases: object

property directives
find(*queries, **kwargs)[source]

Finds matching results anywhere in the configuration

find_all(*queries, **kwargs)

Finds matching results anywhere in the configuration

property sections
select(*queries, **kwargs)[source]

Given a list of queries, executes those queries against the set of Nodes. A Node has three primary attributes: name (str), attrs ([str|int]), and children ([Node]).

Nodes also have a value attribute that is either the first attribute (in the case of simple directives that only have one), or the string representation of all attributes joined by a single space.

Each positional argument to select represents a query against the name and/or attributes of the corresponding level of the configuration tree. The first argument queries root nodes, the second argument queries children of the root nodes, etc.

An individual query is either a single value or a tuple. A single value queries the name of a Node. A tuple queries the name and the attrs.

So: select(name_predicate) or select((name_predicate, attrs_predicate))

In general, select(pred1, pred2, pred3, …)

If a predicate is a simple value (string or int), an exact match is required for names, and an exact match of any attribute is required for attributes.

Examples: select(“Directory”) queries for all root nodes named Directory.

select(“Directory”, “Options”) queries for all root nodes named Directory that contain at least one child node named Options. Notice the argument positions: Directory is in position 1, and Options is in position 2.

select((“Directory”, “/”)) queries for all root nodes named Directory that contain an attribute exactly matching “/”. Notice this is one argument to select: a 2-tuple with predicates for name and attrs.

If you are only interested in attributes, just pass None for the name predicate in the tuple: select((None, “/”)) will return all root nodes with at least one attribute of “/”

In addition to exact matches, the elements of a query can be functions that accept the value corresponding to their position in the query. A handful of useful functions and boolean operators between them are provided.

select(startswith(“Dir”)) queries for all root nodes with names starting with “Dir”.

select(~startswith(“Dir”)) queries for all root nodes with names not starting with “Dir”.

select(startswith(“Dir”) | startswith(“Ali”)) queries for all root nodes with names starting with “Dir” or “Ali”. The return of | is a single callable passed in the first argument position of select.

select(~startswith(“Dir”) & ~startswith(“Ali”)) queries for all root nodes with names not starting with “Dir” or “Ali”.

If a function is in an attribute position, it is considered True if it returns True for any attribute.

For example, select((None, 80)) often will return the list of one Node [Listen 80]

select((“Directory”, startswith(“/var”))) will return all root nodes named Directory that also have an attribute starting with “/var”

If you know that your selection will only return one element, or you only want the first or last result of the query, pass one=first or one=last.

select((“Directory”, startswith(“/”)), one=last) will return the single root node for the last Directory entry starting with “/”

If instead of the root nodes that match you want the child nodes that caused the match, pass roots=False.

node = select((“Directory”, “/var/www/html”), “Options”, one=last, roots=False) might return the Options node if the Directory for “/var/www/html” was defined and contained an Options Directive. You could then access the attributes with node.attrs. If the query didn’t match anything, it would have returned None.

If you want to slide the query down the branches of the config, pass deep=True to select. That allows you to do conf.select(“Directory”, deep=True, roots=False) and get back all Directory nodes regardless of nesting depth.

conf.select() returns everything.

Available predicates are: & (infix boolean and) | (infix boolean or) ~ (prefix boolean not)

For ints or strings: eq (==), ge (>=), gt (>), le (<=), lt (<). For example, conf.select(“Directory”, (“StartServers”, eq(4))) or conf.select(“Directory”, (“StartServers”, ge(4))).

For strings: contains endswith startswith
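
Putting several of the pieces above together, a hedged sketch; it assumes conf is a parsed httpd-style configuration tree, and that the documented predicates are importable from insights.parsr.query (the import location is an assumption):

from insights.parsr.query import eq, last, startswith

root_dirs = conf.select("Directory")                       # match by name
var_dirs = conf.select(("Directory", startswith("/var")))  # name plus attribute
last_dir = conf.select(("Directory", startswith("/")), one=last)
all_dirs = conf.select("Directory", deep=True, roots=False)
small = conf.select(("StartServers", eq(4)), deep=True)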

class insights.core.ConfigParser(context)[source]

Bases: Parser, ConfigComponent

Base Insights component class for Parsers of configuration files.

Raises:

SkipComponent -- When input content is empty.

lineat(pos)[source]
parse_content(content)[source]

This method must be implemented by classes based on this class.

parse_doc(content)[source]
class insights.core.ContainerConfigCombiner(confs, main_file, include_finder, engine, image, container_id)[source]

Bases: ConfigCombiner

Base Insights component class for Combiners of container configuration files with include directives for supplementary configuration files. httpd and nginx are examples.

property conf_path
container_id

The ID of the container.

Type:

str

engine

The engine provider of the container.

Type:

str

image

The image of the container.

Type:

str

class insights.core.ContainerParser(context)[source]

Bases: CommandParser

A class specifically for container parsers, providing the “image” name, the engine provider, and the container ID in addition to the base Parser attributes.

container_id

The ID of the container.

Type:

str

engine

The engine provider of the container.

Type:

str

image

The image of the container.

Type:

str

class insights.core.IniConfigFile(context)[source]

Bases: ConfigParser

A class specifically for reading configuration files in ‘ini’ format.

The input file format supported by this class is:

[section 1]
key = value
; comment
# comment
[section 2]
key with spaces = value string
[section 3]
# Must implement parse_content in child class
# and pass allow_no_value=True to parent class
# to enable keys with no values
key_with_no_value

Examples

>>> class MyConfig(IniConfigFile):
...     pass
>>> content = '''
... [defaults]
... admin_token = ADMIN
... [program opts]
... memsize = 1024
... delay = 1.5
... [logging]
... log = true
... logging level = verbose
... '''.strip()
>>> my_config = MyConfig(context_wrap(content, path='/etc/myconfig.conf'))
>>> 'program opts' in my_config
True
>>> my_config.sections()
['program opts', 'logging']
>>> my_config.defaults()
{'admin_token': 'ADMIN'}
>>> my_config.items('program opts')
{'memsize': 1024, 'delay': 1.5}
>>> my_config.get('logging', 'logging level')
'verbose'
>>> my_config.getint('program opts', 'memsize')
1024
>>> my_config.getfloat('program opts', 'delay')
1.5
>>> my_config.getboolean('logging', 'log')
True
>>> my_config.has_option('logging', 'log')
True
property data

Returns: obj: self, provided for backward compatibility.

defaults()[source]
Returns:

Returns any options under the DEFAULT section.

Return type:

dict

get(section, option)[source]
Parameters:
  • section (str) -- The section str to search for.

  • option (str) -- The option str to search for.

Returns:

Returns the value of the option in the specified section.

Return type:

str

getboolean(section, option)[source]
Returns:

Returns the boolean form of the data from get.

Return type:

bool

getfloat(section, option)[source]
Returns:

Returns the float value of the data from get.

Return type:

float

getint(section, option)[source]
Returns:

Returns the int value of the data from get.

Return type:

int

has_option(section, option)[source]
Parameters:
  • section (str) -- The section str to search for.

  • option (str) -- The option str to search for.

Returns:

Returns whether the option exists in the specified section.

Return type:

bool

items(section)[source]
Parameters:

section (str) -- The section str to search for.

Returns:

Returns all of the options in the specified section.

Return type:

dict

parse_content(content, allow_no_value=False)[source]

This method must be implemented by classes based on this class.

parse_doc(content)[source]
sections()[source]
Returns:

Returns all of the parsed sections excluding DEFAULT.

Return type:

list

set(section, option, value=None)[source]

Sets the value of the specified section option.

Parameters:
  • section (str) -- The section in which to set the option.

  • option (str) -- The option to set.

  • value (str) -- The value to set.

class insights.core.JSONParser(context)[source]

Bases: Parser, LegacyItemAccess

A parser class that reads JSON files. Base your own parser on this.

data

The loaded json content

Type:

dict

unparsed_lines

The skipped unparsed lines

Type:

list

Raises:
  • ParseException -- When any error is thrown while loading the JSON content.

  • SkipComponent -- When content is empty or the loaded data is empty.

parse_content(content)[source]

This method must be implemented by classes based on this class.
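
A minimal sketch of basing a parser on this class; parse_content is inherited, so the subclass body can be empty (in real use the class would also be registered against a spec with the @parser decorator):

from insights.core import JSONParser

class MyJson(JSONParser):
    pass  # instances expose the loaded document via .data and item access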

class insights.core.LazyLogFileOutput(context)[source]

Bases: LogFileOutput

Another class for parsing log file content. Unlike LogFileOutput, LazyLogFileOutput doesn’t load the content during initialization; its content is loaded later, whenever the parser instance is used. It’s useful for cases where thousands of files belonging to one single Spec must be loaded in one pass of running. If any “scan” functions are pre-defined with it, the do_scan method should be called explicitly before using their results, to ensure those results are available. Other than the lazy content loading feature, it’s the same as its base class LogFileOutput.

Examples

>>> class LazyLogOne(LazyLogFileOutput):
...     pass
>>> LazyLogOne.keep_scan('get_one', 'one')
>>> LazyLogOne.last_scan('last_match', 'file')
>>> LazyLogOne.token_scan('find_it', 'more')
>>> my_log1 = LazyLogOne(context_wrap(contents, path='/var/log/log1'))
>>> hasattr(my_log1, 'get_one')
False
>>> hasattr(my_log1, 'last_match')
False
>>> hasattr(my_log1, 'find_it')
False
>>> my_log1.do_scan('get_one')
>>> my_log1.get_one
[{'raw_line': 'Text file line one'}]
>>> my_log1.do_scan()
>>> hasattr(my_log1, 'last_match')
True
>>> hasattr(my_log1, 'find_it')
True
>>> my_log2 = LazyLogOne(context_wrap(contents, path='/var/log/log2'))
>>> my_log2.get(['three', 'more'])
[{'raw_line': 'Text file line three, and more'}]
do_scan(result_key=None)[source]

Do the actual scanning operations as per the specified result_key. When result_key is not specified, all registered scanners will be executed. Each registered scanner can only be executed once.

property lines
scanners = {}
class insights.core.LegacyItemAccess[source]

Bases: object

Mixin class to provide legacy access to self.data attribute.

Provides expected passthru functionality for classes that still use self.data as the primary data structure for all parsed information. Use this as a mixin on parsers that expect these methods to be present as they were previously.

Examples

>>> class MyParser(LegacyItemAccess, Parser):
...     def parse_content(self, content):
...         self.data = {}
...         for line in content:
...             if 'fact' in line:
...                 k, v = line.split('=')
...                 self.data[k.strip()] = v.strip()
>>> content = '''
... # Comment line
... fact1=fact 1
... fact2=fact 2
... fact3=fact 3
... '''.strip()
>>> my_parser = MyParser(context_wrap(content, path='/etc/path_to_content/content.conf'))
>>> my_parser.data
{'fact1': 'fact 1', 'fact2': 'fact 2', 'fact3': 'fact 3'}
>>> my_parser.file_path
'/etc/path_to_content/content.conf'
>>> my_parser.file_name
'content.conf'
>>> my_parser['fact1']
'fact 1'
>>> 'fact2' in my_parser
True
>>> my_parser.get('fact3', default='no fact')
'fact 3'
get(item, default=None)[source]

Returns value of key item in self.data or default if key is not present.

Parameters:
  • item -- Key to get from self.data.

  • default -- Default value to return if key is not present.

Returns:

String value of the stored item, or the default if not found.

Return type:

(str)

class insights.core.LogFileOutput(context)[source]

Bases: TextFileOutput

Class for parsing log file content. For more details, check its superclass TextFileOutput.

An extra get_after method is provided in this LogFileOutput. It depends on the time_format class attribute; to ensure the get_after method works, time_format should be pre-defined according to the time format used in the log file.

get_after(timestamp, s=None)[source]

Find all the (available) logs that are after the given time stamp.

If s is not supplied, then all lines are used. Otherwise, only the lines that contain s are used. s can be either a single string or a list of strings. For a list, all keywords in the list must be found in each line.

This method then finds all lines which have a time stamp after the given timestamp. Lines that do not contain a time stamp are considered to be part of the previous line and are therefore included if the last log line was included or excluded otherwise.

Time stamps are recognised by converting the time format into a regular expression which matches the time format in the string. This is then searched for in each line in turn. Only lines with a time stamp matching this expression will trigger the decision to include or exclude lines. Therefore, if the log for some reason does not contain a time stamp that matches this format, no lines will be returned.

The time format is given in strptime() format, in the object’s time_format property. Users of the object should not change this property; instead, the parser should subclass LogFileOutput and change the time_format property.

Some logs, regrettably, change time stamps formats across different lines, or change time stamp formats in different versions of the program. In order to accommodate this, the timestamp format can be a list of strptime() format strings. These are combined as alternatives in the regular expression, and are given to strptime in order. These can also be listed as the values of a dict, e.g.:

{'pre_10.1.5': '%y%m%d %H:%M:%S', 'post_10.1.5': '%Y-%m-%d %H:%M:%S'}

Note

Some logs - notably /var/log/messages - do not contain a year in the timestamp. This is detected by the absence of a ‘%y’ or ‘%Y’ in the time format. If that year field is absent, the year is assumed to be the year in the given timestamp being sought. Some attempt is made to handle logs with a rollover from December to January, by finding when the log’s timestamp (with current year assumed) is over eleven months (specifically, 330 days) ahead of or behind the timestamp date and shifting that log date by 365 days so that it is more likely to be in the sought range. This paragraph is sponsored by syslog.

Parameters:
  • timestamp (datetime.datetime) -- lines before this time are ignored.

  • s (str or list) -- one or more strings to search for. If not supplied, all available lines are searched.

Yields:

dict -- The parsed lines with timestamps after this date, in the same format they were supplied. Each contains at least raw_message as a key.

Raises:

ParseException -- If the format conversion string contains a format that we don’t recognise. In particular, no attempt is made to recognise or parse the time zone or other obscure values like day of year or week of year.

scanners = {}
time_format = '%Y-%m-%d %H:%M:%S'

The timestamp format assumed for the log files. A subclass can override this for files that have a different timestamp format. This can be:

  • A string in strptime() format.

  • A list of strptime() strings.

  • A dictionary with each item’s value being a strptime() string. This allows the item keys to provide some form of documentation.

  • A None value when there is no timestamp info in the log file
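
A hedged sketch of a subclass with a custom time_format so that get_after can recognise timestamps; the class name and log format here are assumptions:

from datetime import datetime
from insights.core import LogFileOutput

class MyAppLog(LogFileOutput):
    time_format = '%Y-%m-%d %H:%M:%S'

# assuming my_log is a parsed MyAppLog instance:
# errors = list(my_log.get_after(datetime(2023, 6, 1), s='ERROR'))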

class insights.core.Parser(context)[source]

Bases: object

Base class designed to be subclassed by parsers.

The framework will construct your object with a Context that will provide at least the content as an iterable of lines and the path that the content was retrieved from.

Facts should be exposed as instance members where applicable. For example:

self.fact = "123"

Examples

>>> class MyParser(Parser):
...     def parse_content(self, content):
...         self.facts = []
...         for line in content:
...             if 'fact' in line:
...                 self.facts.append(line)
>>> content = '''
... # Comment line
... fact=fact 1
... fact=fact 2
... fact=fact 3
... '''.strip()
>>> my_parser = MyParser(context_wrap(content, path='/etc/path_to_content/content.conf'))
>>> my_parser.facts
['fact=fact 1', 'fact=fact 2', 'fact=fact 3']
>>> my_parser.file_path
'/etc/path_to_content/content.conf'
>>> my_parser.file_name
'content.conf'
file_name

Filename portion of the input file.

Type:

str

file_path

Full context path of the input file.

Type:

str

parse_content(content)[source]

This method must be implemented by classes based on this class.

class insights.core.ScanMeta(name, parents, dct)[source]

Bases: type

class insights.core.Scannable(*args, **kwargs)[source]

Bases: Parser

A class to enable early and easy collection of data in a file.

The Scannable class makes it easy to collect two common types of information from a data file:

  • A flag to indicate that the data contains one or more lines with a given string.

  • a list of lines containing a given string.

To create a parser from the Scannable parser class, the main job is to override the parse() method, returning your choice of data structure to represent the information in the file. This takes the form of a generator that yields structures for users of your parser. You can yield more than one object per line, or you can condense multiple lines into one object. Each object is then scanned with all the defined scanners for this class.

How does that work? Well, the individual rules using your parser will use the any() and collect() methods on the class object itself to set up new attributes of the class that will be given values based on the results of a function that checks each object from your parser for the properties it’s looking for. That’s pretty vague, so let’s give some examples - imagine a parser defined as:

class AnacondaLog(Scannable):
    pass

(Which uses the default parse() function that simply yields each line in turn.) A rule using this parser then does:

def warnings(line):
    return line if 'WARNING' in line else None

def has_fcoe_edd(line):
    return '/usr/libexec/fcoe/fcoe_edd.sh' in line

AnacondaLog.any('has_fcoe', has_fcoe_edd)
AnacondaLog.collect('warnings', warnings)

These then act in the following way:

  • When an object is instantiated from the AnacondaLog class, it will have the ‘has_fcoe’ attribute. This will be set to True if ‘/usr/libexec/fcoe/fcoe_edd.sh’ was found in any line in the file, or False otherwise.

  • When an object is instantiated from the AnacondaLog class, it will have the ‘warnings’ attribute. This will be a list containing all the lines found.

Users of your class can supply any function to either any() or collect(). Functions given to collect() can return anything they want to be collected - if they return something that evaluates to False then nothing is collected (so avoid returning empty lists, empty dicts, empty strings or False).

classmethod any(result_key, func)[source]

Sets the result_key to the output of func if func ever returns truthy

classmethod collect(result_key, func)[source]

Sets the result_key to an iterable of objects for which func(obj) returns True

parse(content)[source]

Default ‘parsing’ method. Subclasses should override this method with their own custom parsing as necessary.

parse_content(content)[source]

This method must be implemented by classes based on this class.

scanners = {}
class insights.core.StreamParser(context)[source]

Bases: Parser

Parsers that don’t have to store lines or look back in the data stream should subclass StreamParser instead of Parser, as it is more memory efficient. The only difference between StreamParser and Parser is that StreamParser.parse_content will receive a generator instead of a list.
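
A minimal sketch (class name assumed) of a StreamParser subclass; content arrives as a generator and is consumed in a single pass:

from insights.core import StreamParser

class ErrorCount(StreamParser):
    def parse_content(self, content):
        # content is a generator of lines; count matches without storing them
        self.error_count = sum(1 for line in content if 'ERROR' in line)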

class insights.core.SysconfigOptions(context)[source]

Bases: Parser, LegacyItemAccess

A parser to handle the standard ‘keyword=value’ format of files in the /etc/sysconfig directory. These are provided in the standard ‘data’ dictionary.

Examples

>>> 'OPTIONS' in ntpconf
True
>>> 'NOT_SET' in ntpconf
False
>>> 'COMMENTED_OUT' in ntpconf
False
>>> ntpconf['OPTIONS']
'-x -g'

For common variables such as OPTIONS, it is recommended to set a specific property in the subclass that fetches this option with a fallback to a default value.

Example subclass:

class DirsrvSysconfig(SysconfigOptions):

    @property
    def options(self):
        return self.data.get('OPTIONS', '')
keys()[source]

Return the list of keys (in no order) in the underlying dictionary.

parse_content(content)[source]

This method must be implemented by classes based on this class.

class insights.core.Syslog(context)[source]

Bases: LogFileOutput

Class for parsing syslog file content.

The important method is get(s), which finds all lines with the string s and parses them into dictionaries with the following keys:

  • timestamp - the time the log line was written

  • procname - the process or facility that wrote the line

  • hostname - the host that generated the log line

  • message - the rest of the message (after the process name)

  • raw_message - the raw message before being split.

It is best to use filters and/or scanners with the messages log, to speed up parsing. These work on the raw message, before being parsed.

Sample log lines:

May  9 15:13:34 lxc-rhel68-sat56 jabberd/sm[11057]: session started: jid=rhn-dispatcher-sat@lxc-rhel6-sat56.redhat.com/superclient
May  9 15:13:36 lxc-rhel68-sat56 wrapper[11375]: --> Wrapper Started as Daemon
May  9 15:13:36 lxc-rhel68-sat56 wrapper[11375]: Launching a JVM...
May 10 15:24:28 lxc-rhel68-sat56 yum[11597]: Installed: lynx-2.8.6-27.el6.x86_64
May 10 15:36:19 lxc-rhel68-sat56 yum[11954]: Updated: sos-3.2-40.el6.noarch

Examples

>>> Syslog.token_scan('daemon_start', 'Wrapper Started as Daemon')
>>> Syslog.token_scan('yum_updated', ['yum', 'Updated'])
>>> Syslog.keep_scan('yum_lines', 'yum')
>>> Syslog.keep_scan('yum_installed_lines', ['yum', 'Installed'])
>>> syslog.get('wrapper')[0]
{'timestamp': 'May  9 15:13:36', 'hostname': 'lxc-rhel68-sat56',
 'procname': 'wrapper[11375]', 'message': '--> Wrapper Started as Daemon',
 'raw_message': 'May  9 15:13:36 lxc-rhel68-sat56 wrapper[11375]: --> Wrapper Started as Daemon'
}
>>> syslog.daemon_start
True
>>> syslog.yum_updated
True
>>> len(syslog.yum_lines)
2
>>> len(syslog.yum_installed_lines)
1

Note

Because syslog timestamps by default have no year, the year of the logs will be inferred from the year in your timestamp. This will also work around December/January crossovers.

get_logs_by_procname(proc)[source]
Parameters:

proc (str) -- The process or facility that you’re looking for

Yields:

(dict) -- The parsed syslog messages produced by that process or facility

scanners = {}
time_format = '%b %d %H:%M:%S'

The timestamp format assumed for the log files. A subclass can override this for files that have a different timestamp format. This can be:

  • A string in strptime() format.

  • A list of strptime() strings.

  • A dictionary with each item’s value being a strptime() string. This allows the item keys to provide some form of documentation.

  • A None value when there is no timestamp info in the log file

class insights.core.TextFileOutput(context)[source]

Bases: Parser

Class for parsing general text file content.

File content is stored in raw format in the lines attribute.

Assume the text file content is:

Text file line one
Text file line two
Text file line three, and more
lines

List of the lines from the text file content.

Type:

list

Examples

>>> class MyTexter(TextFileOutput):
...     pass
>>> MyTexter.keep_scan('get_one', 'one')
>>> MyTexter.keep_scan('get_three_and_more', ['three', 'more'])
>>> MyTexter.keep_scan('get_one_or_two', ['one', 'two'], check=any)
>>> MyTexter.last_scan('last_line_contains_file', 'file')
>>> MyTexter.keep_scan('last_2_lines_contain_file', 'file', num=2, reverse=True)
>>> MyTexter.keep_scan('last_3_lines_contain_line_and_t', ['line', 't'], num=3, reverse=True)
>>> MyTexter.token_scan('find_more', 'more')
>>> MyTexter.token_scan('find_four_and_more', ['four', 'more'])
>>> MyTexter.token_scan('find_four_or_more', ['four', 'more'], check=any)
>>> my_texter = MyTexter(context_wrap(contents, path='/var/log/text.txt'))
>>> my_texter.file_path
'/var/log/text.txt'
>>> my_texter.file_name
'text.txt'
>>> my_texter.get('two')
[{'raw_line': 'Text file line two'}]
>>> 'line three,' in my_texter
True
>>> my_texter.get(['three', 'more'])
[{'raw_line': 'Text file line three, and more'}]
>>> my_texter.lines[0]
'Text file line one'
>>> my_texter.get_one
[{'raw_line': 'Text file line one'}]
>>> my_texter.get_three_and_more == my_texter.get(['three', 'more'])
True
>>> my_texter.last_line_contains_file
{'raw_line': 'Text file line three, and more'}
>>> len(my_texter.last_2_lines_contain_file)
2
>>> len(my_texter.last_3_lines_contain_line_and_t)  # Only 2 lines contain 'line' and 't'
2
>>> my_texter.find_more
True
>>> my_texter.find_four_and_more
False
>>> my_texter.find_four_or_more
True
get(s, check=<built-in function all>, num=None, reverse=False)[source]

Returns all lines that contain s anywhere, wrapped in a list of dictionaries. s can be either a single string or a list of strings. For a list, all keywords in the list must be found in each line.

Parameters:
  • s (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

  • num (int) -- the number of lines to get, None for unlimited

  • reverse (bool) -- scan from the head when False (the default), otherwise scan from the tail

Returns:

list of dictionaries corresponding to the parsed lines that contain s.

Return type:

(list)

Raises:

TypeError -- When s is not a string or a list of strings, or num is not an integer.

classmethod keep_scan(result_key, token, check=<built-in function all>, num=None, reverse=False)[source]

Define a property that is set to the list of dictionaries of the lines that contain the given token. Uses the get method of the log file.

Parameters:
  • result_key (str) -- the scanner key to register

  • token (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

  • num (int) -- the number of lines to get, None for unlimited

  • reverse (bool) -- scan from the head when False (the default), otherwise scan from the tail

Returns:

list of dictionaries corresponding to the parsed lines that contain the token.

Return type:

(list)

classmethod last_scan(result_key, token, check=<built-in function all>)[source]

Define a property that is set to the dictionary of the last line that contains the given token. Uses the get method of the log file.

Parameters:
  • result_key (str) -- the scanner key to register

  • token (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

Returns:

dictionary corresponding to the last parsed line that contains the token.

Return type:

(dict)

parse_content(content)[source]

Use all the defined scanners to search the log file, setting the properties defined in the scanner.

classmethod scan(result_key, func)[source]

Define computed fields based on a string to “grep for”. This is preferred to utilizing raw log lines in plugins because computed fields will be serialized, whereas raw log lines will not.

Raises:

ValueError -- When result_key is already a registered scanner key.

scanners = {}
classmethod token_scan(result_key, token, check=<built-in function all>)[source]

Define a property that is set to true if the given token is found in the log file. Uses the __contains__ method of the log file.

Parameters:
  • result_key (str) -- the scanner key to register

  • token (str or list) -- one or more strings to search for

  • check (func) -- built-in function all or any applied to each line

Returns:

the property will contain True if a line contained (any or all) of the tokens given.

Return type:

(bool)

class insights.core.XMLParser(context)[source]

Bases: LegacyItemAccess, Parser

A parser class that reads XML files. Base your own parser on this.

Examples

>>> content = '''
... <?xml version="1.0"?>
... <data xmlns:fictional="http://characters.example.com"
...       xmlns="http://people.example.com">
...     <country name="Liechtenstein">
...         <rank updated="yes">2</rank>
...         <year>2008</year>
...         <gdppc>141100</gdppc>
...         <neighbor name="Austria" direction="E"/>
...         <neighbor name="Switzerland" direction="W"/>
...     </country>
...     <country name="Singapore">
...         <rank updated="yes">5</rank>
...         <year>2011</year>
...         <gdppc>59900</gdppc>
...         <neighbor name="Malaysia" direction="N"/>
...     </country>
...     <country name="Panama">
...         <rank>68</rank>
...         <year>2011</year>
...         <gdppc>13600</gdppc>
...         <neighbor name="Costa Rica" direction="W"/>
...     </country>
... </data>
... '''.strip()
>>> xml_parser = XMLParser(context_wrap(content))
>>> xml_parser.xmlns
'http://people.example.com'
>>> xml_parser.get_elements(".")[0].tag # Top-level elements
'data'
>>> len(xml_parser.get_elements("./country/neighbor", None)) # All 'neighbor' grand-children of 'country' children of the top-level elements
3
>>> len(xml_parser.get_elements(".//year/..[@name='Singapore']")[0]) # Nodes with name='Singapore' that have a 'year' child
1
>>> xml_parser.get_elements(".//*[@name='Singapore']/year")[0].text # 'year' nodes that are children of nodes with name='Singapore'
'2011'
>>> xml_parser.get_elements(".//neighbor[2]", "http://people.example.com")[0].get('name') # All 'neighbor' nodes that are the second child of their parent
'Switzerland'
raw

raw XML content

Type:

str

dom

Root element of parsed XML file

Type:

Element

xmlns

The default XML namespace, an empty string when no namespace is declared.

Type:

str

data

All required specific properties can be included in data.

Type:

dict

get_elements(element, xmlns=None)[source]

Return a list of elements that match the search condition. If the XML input has namespaces, elements and attributes with prefixes in the form prefix:sometag get expanded to {namespace}element, where the prefix is replaced by the full URI. Also, if there is a default namespace, that full URI gets prepended to all of the non-prefixed tags. Element names can contain letters, digits, hyphens, underscores, and periods, but must start with a letter or underscore. Internally, a while-clause expands the search condition from /element1/element2 to /{namespace}element1/{namespace}element2.

Parameters:
  • element -- Search condition used to find certain elements in an XML file. For more details about how to set the search condition, refer to sections 19.7.2.1. Example and 19.7.2.2. Supported XPath syntax in https://docs.python.org/2/library/xml.etree.elementtree.html

  • xmlns -- XML namespace; default value is None. None means that xmlns equals self.xmlns (the default namespace) rather than “” all the time. Only a string-type parameter (including “”) will be regarded as a valid XML namespace.

Returns:

List of elements that match the search condition

Return type:

(list)

parse_content(content)[source]

All child classes inherit this function to parse the XML file automatically. By default it calls parse_dom() to parse all necessary data into data; the xmlns (the default namespace) is already set when this function runs.

parse_dom()[source]

If self.data is required, all child classes need to override this function to set it.

class insights.core.YAMLParser(context)[source]

Bases: Parser, LegacyItemAccess

A parser class that reads YAML files. Base your own parser on this.

parse_content(content)[source]

This method must be implemented by classes based on this class.
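
A minimal sketch, parallel to the JSONParser example above; the loaded YAML document is likewise available as .data and via item access:

from insights.core import YAMLParser

class MyYaml(YAMLParser):
    pass  # parse_content is inherited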

insights.core.default_parser_deserializer(_type, data, root=None, ctx=None, ds=None)[source]
insights.core.default_parser_serializer(obj)[source]
insights.core.flatten(docs, pred)[source]

Replace include nodes with their config trees. Allows the same files to be included more than once so long as they don’t induce a recursion.

insights.core.context

class insights.core.context.ClusterArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

class insights.core.context.Context(**kwargs)[source]

Bases: object

product()[source]
stream()[source]
class insights.core.context.Docker(role=None)[source]

Bases: MultiNodeProduct

name = 'docker'
parent_type = 'host'
class insights.core.context.DockerImageContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

class insights.core.context.ExecutionContext(root='/', timeout=None, all_files=None)[source]

Bases: object

check_output(cmd, timeout=None, keep_rc=False, env=None, signum=None)[source]

Subclasses can override to provide special environment setup, command prefixes, etc.

connect(*args, **kwargs)[source]
classmethod handles(files)[source]
locate_path(path)[source]
marker = None
shell_out(cmd, split=True, timeout=None, keep_rc=False, env=None, signum=None)[source]
stream(*args, **kwargs)[source]
class insights.core.context.ExecutionContextMeta(name, bases, dct)[source]

Bases: type

classmethod identify(files)[source]
registry = [<class 'insights.core.context.HostContext'>, <class 'insights.core.context.HostArchiveContext'>, <class 'insights.core.context.SerializedArchiveContext'>, <class 'insights.core.context.SosArchiveContext'>, <class 'insights.core.context.ClusterArchiveContext'>, <class 'insights.core.context.DockerImageContext'>, <class 'insights.core.context.JBossContext'>, <class 'insights.core.context.JDRContext'>, <class 'insights.core.context.InsightsOperatorContext'>, <class 'insights.core.context.MustGatherContext'>, <class 'insights.core.context.OpenStackContext'>]
class insights.core.context.HostArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

marker = 'insights_commands'
class insights.core.context.HostContext(root='/', timeout=30, all_files=None)[source]

Bases: ExecutionContext

class insights.core.context.InsightsOperatorContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

Recognizes insights-operator archives

marker = 'config/featuregate'
class insights.core.context.JBossContext(root='/', timeout=30, all_files=None)[source]

Bases: HostContext

class insights.core.context.JDRContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

locate_path(path)[source]
marker = 'JBOSS_HOME'
class insights.core.context.MultiNodeProduct(role=None)[source]

Bases: object

is_parent()[source]
class insights.core.context.MustGatherContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

Recognizes must-gather archives

marker = 'cluster-scoped-resources'
class insights.core.context.OSP(role=None)[source]

Bases: MultiNodeProduct

name = 'osp'
parent_type = 'Director'
class insights.core.context.OpenStackContext(hostname)[source]

Bases: ExecutionContext

class insights.core.context.RHEL(version=['-1', '-1'], release=None)[source]

Bases: object

classmethod from_metadata(metadata, processor_obj)[source]
name = 'rhel'
class insights.core.context.RHEV(role=None)[source]

Bases: MultiNodeProduct

name = 'rhev'
parent_type = 'Manager'
class insights.core.context.SerializedArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

marker = 'insights_archive.txt'
class insights.core.context.SosArchiveContext(root='/', timeout=None, all_files=None)[source]

Bases: ExecutionContext

marker = 'sos_commands'
insights.core.context.create_product(metadata, hostname)[source]
insights.core.context.fs_root(thing)[source]
insights.core.context.get_system(metadata, hostname)[source]
insights.core.context.product(klass)[source]

insights.core.dr

This module implements an inversion of control framework. It allows dependencies among functions and classes to be declared with decorators and the resulting dependency graphs to be executed.

A decorator used to declare dependencies is called a ComponentType, a decorated function or class is called a component, and a collection of interdependent components is called a graph.

In the example below, needs is a ComponentType, one, two, and add are components, and the relationship formed by their dependencies is a graph.

from insights import dr

class needs(dr.ComponentType):
    pass

@needs()
def one():
    return 1

@needs()
def two():
    return 2

@needs(one, two)
def add(a, b):
    return a + b

results = dr.run(add)

Once all components have been imported, the graphs they form can be run. To execute a graph, dr sorts its components into an order that guarantees dependencies are tried before dependents. Components that raise exceptions are considered invalid, and their dependents will not be executed. If a component is skipped because of a missing dependency, its dependents also will not be executed.

During evaluation, results are accumulated into an object called a Broker, which is just a fancy dictionary. Brokers can be inspected after a run for results, exceptions, tracebacks, and execution times. You also can register callbacks with a broker that get invoked after the attempted execution of every component, so you can inspect it during an evaluation instead of at the end.
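
Continuing the example above, a hedged sketch of inspecting the broker after a run; the attributes used here are documented on Broker below:

broker = dr.run(add)
print(broker[add])             # 3 -- results are keyed by component
print(broker.exec_times[add])  # seconds spent executing add
print(list(broker.instances))  # every component that produced a value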

class insights.core.dr.Broker(seed_broker=None)[source]

Bases: object

The Broker is a fancy dictionary that keeps up with component instances as a graph is evaluated. It’s the state of the evaluation. Once a graph has executed, the broker will contain everything about the evaluation: component instances, timings, exceptions, and tracebacks.

You can either inspect the broker at the end of an evaluation, or you can register callbacks with it, and they’ll get invoked after each component is called.

instances

the component instances with components as keys.

Type:

dict

missing_requirements

components that didn’t have their dependencies met. Values are a two-tuple. The first element is the list of required dependencies that were missing. The second element is the list of “at least one” dependencies that were missing. For more information on dependency types, see the ComponentType docs.

Type:

dict

exceptions

Components that raise any type of exception except SkipComponent during evaluation. The key is the component, and the value is a list of exceptions. It’s a list because some components produce multiple instances.

Type:

defaultdict(list)

tracebacks

keys are exceptions and values are their text tracebacks.

Type:

dict

exec_times

component -> float dictionary where values are the number of seconds the component took to execute. Calculated using time.time(). For components that produce multiple instances, the execution time here is the sum of their individual execution times.

Type:

dict

store_skips

Whether to store skips in the broker or not.

Type:

bool

add_exception(component, ex, tb=None)[source]
add_observer(o, component_type=<class 'insights.core.dr.ComponentType'>)[source]

Add a callback that will get invoked after each component is called.

Parameters:

o (func) -- the callback function

Keyword Arguments:

component_type (ComponentType) -- the ComponentType to observe. The callback will fire any time an instance of the class or its subclasses is invoked.

The callback should look like this:

def callback(comp, broker):
    value = broker.get(comp)
    # do something with value
    pass
fire_observers(component)[source]
get(component, default=None)[source]
get_by_type(_type)[source]

Return all of the instances of ComponentType _type.

items()[source]
keys()[source]
observer(component_type=<class 'insights.core.dr.ComponentType'>)[source]

You can use @broker.observer() as a decorator to your callback instead of Broker.add_observer().

print_component(component_type)[source]
values()[source]
insights.core.dr.add_dependency(component, dep)[source]
insights.core.dr.add_dependent(component, dep)[source]
insights.core.dr.add_ignore(c, i)[source]
insights.core.dr.add_observer(o, component_type=<class 'insights.core.dr.ComponentType'>)[source]

Add a callback that will get invoked after each component is called.

Parameters:

o (func) -- the callback function

Keyword Arguments:

component_type (ComponentType) -- the ComponentType to observe. The callback will fire any time an instance of the class or its subclasses is invoked.

The callback should look like this:

def callback(comp, broker):
    value = broker.get(comp)
    # do something with value
    pass
insights.core.dr.determine_components(components)[source]
insights.core.dr.first_of(dependencies, broker)[source]
insights.core.dr.generate_incremental(components=None, broker=None)[source]
insights.core.dr.get_base_module_name(obj)[source]
insights.core.dr.get_component(name)[source]

Returns the class or function specified, importing it if necessary.

insights.core.dr.get_component_by_name(name)[source]

Look up a component by its fully qualified name. Return None if the component hasn’t been loaded.

insights.core.dr.get_component_type(component)[source]
insights.core.dr.get_components_of_type(_type)[source]
insights.core.dr.get_delegate(component)[source]
insights.core.dr.get_dependencies(component)[source]
insights.core.dr.get_dependency_graph(component)[source]

Generate a component’s graph of dependencies, which can be passed to run() or run_incremental().

insights.core.dr.get_dependency_specs(component)[source]

Get the dependency specs of the specified component. Only requires and at_least_one specs will be returned. The optional specs are not considered by this function.

Parameters:

component (callable) -- The component to check. The component must already be loaded.

Returns:

The requires and at_least_one spec sets of the component.

Return type:

list

The return list is in the following format:

 [
     requires_1,
     requires_2,
     (at_least_one_11, at_least_one_12),
     (at_least_one_21, [req_alo22, (alo_23, alo_24)]),
 ]

Note:
 - 'requires_1' and 'requires_2' are `requires` specs.
   Each of them is required.
 - 'at_least_one_11' and 'at_least_one_12' are `at_least_one`
   specs in the same at-least-one set.
   At least one of them is required.
 - 'alo_23' and 'alo_24' are `at_least_one` specs and,
   together with 'req_alo22', are `requires` for the
   sub-set. These sub-set specs and 'at_least_one_21' are
   `at_least_one` specs in the same at-least-one set.
insights.core.dr.get_dependents(component)[source]
insights.core.dr.get_group(component)[source]

Return the dictionary of links associated with the component. Defaults to dict().

insights.core.dr.get_metadata(component)[source]

Return any metadata dictionary associated with the component. Defaults to an empty dictionary.

insights.core.dr.get_missing_requirements(func, requires, d)[source]

Deprecated since version 1.x.

insights.core.dr.get_module_name(obj)[source]
insights.core.dr.get_name(component)[source]

Attempt to get the string name of component, including module and class if applicable.

insights.core.dr.get_registry_points(component, datasource=None)[source]

Loop through the dependency graph to identify the corresponding spec registry points for the component. This is primarily used by datasources and returns a set. In most cases only one registry point will be included in the set, but in some cases more than one.

Parameters:
  • component (callable) -- The component object

  • datasource (bool) -- Used internally to avoid infinite recursive calls.

Returns:

A set of the registry points found.

Return type:

(set)

insights.core.dr.get_simple_name(component)[source]
insights.core.dr.get_subgraphs(graph=None)[source]

Given a graph of possibly disconnected components, generate all graphs of connected components. graph is a dictionary of dependencies. Keys are components, and values are sets of components on which they depend.

Return the sub-graphs sorted as per the “prio”.

insights.core.dr.get_tags(component)[source]

Return the set of tags associated with the component. Defaults to set().

insights.core.dr.hashable(v)[source]
insights.core.dr.is_datasource(component)[source]
insights.core.dr.is_enabled(component)[source]

Check to see if a component is enabled.

Parameters:

component (callable) -- The component to check. The component must already be loaded.

Returns:

True if the component is enabled. False otherwise.

insights.core.dr.is_hidden(component)[source]
insights.core.dr.is_registry_point(component)[source]
insights.core.dr.load_components(*paths, **kwargs)[source]

Loads all components on the paths. Each path should be a package or module. All components beneath a path are loaded.

Parameters:

paths (str) -- A package or module to load

Keyword Arguments:
  • include (str) -- A regular expression of packages and modules to include. Defaults to ‘.*’

  • exclude (str) -- A regular expression of packages and modules to exclude. Defaults to ‘test’

  • continue_on_error (bool) -- If True, continue importing even if something raises an ImportError. If False, raise the first ImportError.

Returns:

The total number of modules loaded.

Return type:

int

Raises:

ImportError --
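
A hedged usage sketch; the package path and patterns are illustrative:

from insights import dr

count = dr.load_components('insights.parsers', exclude='test',
                           continue_on_error=True)
print(count, 'modules loaded')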

insights.core.dr.mark_hidden(component)[source]
insights.core.dr.observer(component_type=<class 'insights.core.dr.ComponentType'>)[source]

You can use @broker.observer() as a decorator to your callback instead of add_observer().

insights.core.dr.run(components=None, broker=None)[source]

Executes components in an order that satisfies their dependency relationships.

Keyword Arguments:
  • components -- Can be one of a dependency graph, a single component, a component group, or a component type. If it’s anything other than a dependency graph, the appropriate graph is built for you before evaluation.

  • broker (Broker) -- Optionally pass a broker to use for evaluation. One is created by default, but it’s often useful to seed a broker with an initial dependency.

Returns:

The broker after evaluation.

Return type:

Broker

insights.core.dr.run_all(components=None, broker=None, pool=None)[source]
insights.core.dr.run_components(ordered_components, components, broker)[source]

Runs a list of preordered components using the provided broker.

This function allows callers to order components themselves and cache the result so they don’t incur the toposort overhead on every run.

insights.core.dr.run_incremental(components=None, broker=None)[source]

Executes components in an order that satisfies their dependency relationships. Disjoint subgraphs are executed one at a time and a broker containing the results for each is yielded. If a broker is passed here, its instances are used to seed the broker used to hold state for each sub graph.

Keyword Arguments:
  • components -- Can be one of a dependency graph, a single component, a component group, or a component type. If it’s anything other than a dependency graph, the appropriate graph is built for you before evaluation.

  • broker (Broker) -- Optionally pass a broker to use for evaluation. One is created by default, but it’s often useful to seed a broker with an initial dependency.

Yields:

Broker -- the broker used to evaluate each subgraph.

insights.core.dr.run_order(graph)[source]

Returns components in an order that satisfies their dependency relationships.

insights.core.dr.set_enabled(component, enabled=True)[source]

Enable a component for evaluation. If set to False, the component is skipped, and all components that require it will not execute.

If component is a fully qualified name string of a callable object instead of the callable object itself, the component’s module is loaded as a side effect of calling this function.

Parameters:
  • component (str or callable) -- fully qualified name of the component or the component object itself.

  • enabled (bool) -- whether the component is enabled for evaluation.

Returns:

None
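
A hedged sketch of disabling a component by its fully qualified name; the parser named here is only an example:

dr.set_enabled('insights.parsers.hostname.Hostname', enabled=False)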

insights.core.dr.split_requirements(requires)[source]
insights.core.dr.stringify_requirements(requires)[source]
insights.core.dr.walk_dependencies(root, visitor)[source]

Call visitor on root and all dependencies reachable from it in breadth first order.

Parameters:
  • root (component) -- component function or class

  • visitor (function) -- signature is func(component, parent). The call on root is visitor(root, None).

insights.core.dr.walk_tree(root, method=<function get_dependencies>)[source]
class insights.core.dr.ComponentType(*deps, **kwargs)[source]

ComponentType is the base class for all component type decorators.

For Example:

class my_component_type(ComponentType):
    pass

@my_component_type(SshDConfig, InstalledRpms, [ChkConfig, UnitFiles], optional=[IPTables, IpAddr])
def my_func(sshd_config, installed_rpms, chk_config, unit_files, ip_tables, ip_addr):
    return installed_rpms.newest("bash")

Notice that the arguments to my_func correspond to the dependencies in the @my_component_type and are in the same order.

When used, a my_component_type instance is created whose __init__ gets passed dependencies and whose __call__ gets passed the component to run if dependencies are met.

Parameters to the decorator have these forms:

Criteria        Example Decorator Arguments     Description
------------    ----------------------------    --------------------------
Required        SshDConfig, InstalledRpms       A regular argument
At Least One    [ChkConfig, UnitFiles]          An argument as a list
Optional        optional=[IPTables, IpAddr]     A list following optional=

If a parameter is required, the value provided for it is guaranteed not to be None. In the example above, sshd_config and installed_rpms will not be None.

At least one of the arguments to parameters of an “at least one” list will not be None. In the example, either or both of chk_config and unit_files will not be None.

Any or all arguments for optional parameters may be None.

The following keyword arguments may be passed to the decorator:

requires

a list of components that all components decorated with this type will implicitly require. Additional components passed to the decorator will be appended to this list.

Type:

list

optional

a list of components that all components decorated with this type will implicitly depend on optionally. Additional components passed as optional to the decorator will be appended to this list.

Type:

list

metadata

an arbitrary dictionary of information to associate with the component you’re decorating. It can be retrieved with get_metadata.

Type:

dict

tags

a list of strings that categorize the component. Useful for formatting output or sifting through results for components you care about.

Type:

list

group

GROUPS.single or GROUPS.cluster. Used to organize components into “groups” that run together with insights.core.dr.run().

cluster

if True will put the component into the GROUPS.cluster group. Defaults to False. Overrides group if True.

Type:

bool

get_missing_dependencies(broker)[source]

Gets required and at-least-one dependencies not provided by the broker.

invoke(results)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.

process(broker)[source]

Ensures dependencies have been met before delegating to self.invoke.

insights.core.exceptions

Exceptions

exception insights.core.exceptions.BlacklistedSpec[source]

Bases: Exception

Exception to be thrown when a blacklisted spec is found.

exception insights.core.exceptions.CalledProcessError(returncode, cmd, output=None)[source]

Bases: Exception

Raised if the call fails.

Parameters:
  • returncode (int) -- The return code of the process executing the command.

  • cmd (str) -- The command that was executed.

  • output (str) -- Any output the command produced.

exception insights.core.exceptions.ContentException[source]

Bases: SkipComponent

Raised whenever a datasource fails to get data.

exception insights.core.exceptions.InvalidArchive(msg)[source]

Bases: Exception

Raised when an archive cannot be identified or is missing the expected structure.

exception insights.core.exceptions.InvalidContentType(content_type)[source]

Bases: InvalidArchive

Raised when an invalid content_type is specified.

exception insights.core.exceptions.MissingRequirements(requirements)[source]

Bases: Exception

Raised during evaluation if a component’s dependencies aren’t met.

exception insights.core.exceptions.NoFilterException[source]

Bases: Exception

Raised whenever no filters are added to a filterable datasource.

exception insights.core.exceptions.ParseException[source]

Bases: Exception

Exception that should be thrown from parsers that encounter exceptions they recognize while parsing. When this exception is thrown, the exception message and data are logged and no parser output data is saved.

exception insights.core.exceptions.SkipComponent[source]

Bases: Exception

This class should be raised by components that want to be taken out of dependency resolution.
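
A hedged sketch of a parser opting out of dependency resolution when there is nothing useful to parse (class name assumed):

from insights.core import Parser
from insights.core.exceptions import SkipComponent

class NonEmpty(Parser):
    def parse_content(self, content):
        if not content:
            raise SkipComponent('empty content')
        self.lines = content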

exception insights.core.exceptions.TimeoutException[source]

Bases: Exception

Raised whenever a datasource hits the set timeout value.

exception insights.core.exceptions.ValidationException(msg, r=None)[source]

Bases: Exception

Raised when invalid arguments are passed to a Response.

insights.core.filters

The filters module allows developers to apply filters to datasources, by adding them directly or through dependent components like parsers and combiners. A filter is a simple string, and it matches if it is contained anywhere within a line.

If a datasource has filters defined, it will return only lines matching at least one of them. If a datasource has no filters, it will return all lines.

Filters can be added to components like parsers and combiners, to apply consistent filtering to multiple underlying datasources that are configured as filterable.

Filters aren’t applicable to “raw” datasources, which are created with kind=RawFileProvider and have RegistryPoint instances with raw=True.

The addition of a single filter can cause a datasource to change from returning all lines to returning just those that match. Therefore, any filtered datasource should have at least one filter in the commit introducing it so downstream components don’t inadvertently change its behavior.

The benefit of this fragility is the ability to drastically reduce in-memory footprint and archive sizes. An additional benefit is the ability to evaluate only lines known to be free of sensitive information.

Filters added to a RegistryPoint will be applied to all datasources that implement it. Filters added to a datasource implementation apply only to that implementation.

For example, a filter added to Specs.ps_auxww will apply to DefaultSpecs.ps_auxww, InsightsArchiveSpecs.ps_auxww, SosSpecs.ps_auxww, etc. But a filter added to DefaultSpecs.ps_auxww will only apply to DefaultSpecs.ps_auxww. See the modules in insights.specs for those classes.

Filtering can be disabled globally by setting the environment variable INSIGHTS_FILTERS_ENABLED=False. This means that no datasources will be filtered even if filters are defined for them.

insights.core.filters.add_filter(component, patterns, max_match=10000)[source]

Add a filter or list of filters to a component. When the component is a datasource, the filter will be added directly to that datasource. When the component is a parser or combiner, the filter will be added to the underlying filterable datasources by traversing the dependency graph. A filter is a simple string, and it matches if it is contained anywhere within a line.

Parameters:
  • component (component) -- The component to filter, can be datasource, parser or combiner.

  • patterns (str, [str]) -- A string, list of strings, or set of strings to add to the datasource’s filters.

  • max_match (int) -- The maximum number of matched lines to collect. Defaults to MAX_MATCH.
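
For example, a minimal sketch of registering filters; Specs.messages is assumed here to be a filterable spec, and the filter strings are made up:

from insights.core import filters
from insights.specs import Specs

# Only lines containing at least one of these substrings will be collected.
filters.add_filter(Specs.messages, ["kernel", "systemd"])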

insights.core.filters.apply_filters(target, lines)[source]

Applies filters to the lines of a datasource. This function is used only in integration tests. Filters are applied in an equivalent but more performant way at run time.

insights.core.filters.dump(stream=None)[source]

Dumps a string representation of FILTERS to a stream, normally an open file. If none is passed, FILTERS is dumped to a default location within the project.

insights.core.filters.dumps()[source]

Returns a string representation of the sorted FILTERS dictionary.

insights.core.filters.get_filters(component, with_matches=False)[source]

Get the set of filters for the given datasource.

Filters added to a RegistryPoint will be applied to all datasources that implement it. Filters added to a datasource implementation apply only to that implementation.

For example, a filter added to Specs.ps_auxww will apply to DefaultSpecs.ps_auxww, InsightsArchiveSpecs.ps_auxww, SosSpecs.ps_auxww, etc. But a filter added to DefaultSpecs.ps_auxww will only apply to DefaultSpecs.ps_auxww. See the modules in insights.specs for those classes.

Parameters:
  • component (a datasource) -- The target datasource

  • with_matches (bool) -- Whether to also return the max match count for each filter. Defaults to False.

Returns:

When with_matches=False, returns the set of filters defined for the datasource. When with_matches=True, returns a dict of the filters defined for the datasource mapped to the max match count specified by add_filter.

Return type:

(set or dict)
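
Continuing the add_filter sketch above (the spec and filter strings remain assumptions):

from insights.core import filters
from insights.specs import Specs

filters.get_filters(Specs.messages)
# e.g. set(['kernel', 'systemd'])

filters.get_filters(Specs.messages, with_matches=True)
# e.g. {'kernel': 10000, 'systemd': 10000}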

insights.core.filters.load(stream=None)[source]

Loads filters from a stream, normally an open file. If one is not passed, filters are loaded from a default location within the project.

insights.core.filters.loads(string)[source]

Loads the filters dictionary given a string.

insights.core.plugins

The plugins module defines the components used by the rest of insights and specializes their interfaces and execution model where required.

This module includes the following ComponentType subclasses:

It also contains the following Response subclasses that rules may return:

class insights.core.plugins.PluginType(*deps, **kwargs)[source]

Bases: ComponentType

PluginType is the base class of plugin types like datasource, rule, etc. It provides a default invoke method that catches exceptions we don’t want bubbling to the top of the evaluation loop. These exceptions are commonly raised by datasource components but could occur in the context of any component, since most datasource runtime errors are lazy.

It’s possible for a datasource to “succeed” and return an object but for an exception to be raised when the parser tries to access the content of that object. For example, when a command datasource is evaluated, it only checks that the command exists and is executable. Invocation of the command itself is delayed until the parser asks for its value. This helps with performance and memory consumption.

invoke(broker)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.

class insights.core.plugins.Response(key, **kwargs)[source]

Bases: dict

Response is the base class of response types that can be returned from rules.

Subclasses must call __init__ of this class via super() and must provide the response_type class attribute.

The key_name class attribute is optional, but if one is specified, the first argument to __init__ must not be None. If key_name is None, then the first argument to __init__ should be None. It’s best to override __init__ in subclasses so users aren’t required to pass None explicitly.
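
A minimal sketch of a subclass following those rules; the response type and key names here are made up for illustration:

from insights.core.plugins import Response

class make_warn(Response):
    # required: how downstream systems identify this response type
    response_type = "warn"
    # optional: the key used to look up the rule's identifier
    key_name = "warn_key"

    def __init__(self, key, **kwargs):
        super(make_warn, self).__init__(key, **kwargs)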

adjust_for_length(key, r, kwargs)[source]

Converts the response to a string and compares its length to a max length specified in settings. If the response is too long, an error is logged, and an abbreviated response is returned instead.

get_key()[source]

Helper function that uses the response’s key_name to look up the response identifier. For a rule, this is like response.get(“error_key”).

key_name = None

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = None

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

validate_key(key)[source]

Called if the key_name class attribute is not None.

validate_kwargs(kwargs)[source]

Validates expected subclass attributes and constructor keyword arguments.

class insights.core.plugins.combiner(*deps, **kwargs)[source]

Bases: PluginType

A decorator for a component that composes or “combines” other components.

A typical use case is hiding slight variations in related parser interfaces. Another use case is to combine several related parsers behind a single, cohesive, higher level interface.
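
A hedged sketch of the pattern, assuming the ChkConfig and UnitFiles parsers and their is_on helpers; the service names are illustrative:

from insights.core.plugins import combiner
from insights.parsers.chkconfig import ChkConfig
from insights.parsers.systemd.unitfiles import UnitFiles

@combiner([ChkConfig, UnitFiles])  # at least one of the two must be present
def sshd_enabled(chk_config, unit_files):
    # Hide the difference between SysV and systemd service data behind
    # one interface.
    if chk_config:
        return chk_config.is_on("sshd")
    return unit_files.is_on("sshd.service")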

class insights.core.plugins.component(*deps, **kwargs)[source]

Bases: PluginType

class insights.core.plugins.condition(*deps, **kwargs)[source]

Bases: PluginType

ComponentType used to encapsulate boolean logic you’d like to have analyzed by a rule analysis system. Conditions should return truthy values. None is also a valid return type for conditions, so rules that depend on conditions that might return None should check their validity.
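
A minimal sketch of a condition, assuming the InstalledRpms parser; the package check is made up:

from insights.core.plugins import condition
from insights.parsers.installed_rpms import InstalledRpms

@condition(InstalledRpms)
def has_bash(rpms):
    # Conditions should return truthy values.
    return rpms.newest("bash") is not None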

class insights.core.plugins.datasource(*deps, **kwargs)[source]

Bases: PluginType

Decorates a component that one or more insights.core.Parser subclasses will consume.

filterable = False
invoke(broker)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.

multi_output = False
no_obfuscate = []
no_redact = False
prio = 0
raw = False
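
A hedged sketch of a custom datasource, following the convention of returning a DatasourceProvider from a function that takes the broker; the path and content are made up:

from insights.core.context import HostContext
from insights.core.plugins import datasource
from insights.core.spec_factory import DatasourceProvider

@datasource(HostContext)
def release_notes(broker):
    # Compute content at collection time and wrap it in a provider.
    return DatasourceProvider(content=["example line"],
                              relative_path="insights_datasources/release_notes")
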
class insights.core.plugins.fact(*deps, **kwargs)[source]

Bases: PluginType

ComponentType for a component that surfaces a dictionary or list of dictionaries that will be used later by cluster rules. The data from a fact is converted to a pandas DataFrame.

class insights.core.plugins.incident(*deps, **kwargs)[source]

Bases: PluginType

ComponentType for a component used by rules that allows automated statistical analysis.

insights.core.plugins.is_combiner(component)[source]
insights.core.plugins.is_component(obj)[source]
insights.core.plugins.is_datasource(component)[source]
insights.core.plugins.is_parser(component)[source]
insights.core.plugins.is_rule(component)[source]
insights.core.plugins.is_type(component, _type)[source]
class insights.core.plugins.make_fail(key, **kwargs)[source]

Bases: make_response

Returned by a rule to signal that its conditions have been met.

Example:

# completely made up package
buggy = InstalledRpms.from_package("bash-3.4.23-1.el7")

@rule(InstalledRpms)
def report(installed_rpms):
   bash = installed_rpms.newest("bash")
   if bash == buggy:
       return make_fail("BASH_BUG_123", bash=bash)
   return make_pass("BASH", bash=bash)
class insights.core.plugins.make_fingerprint(key, **kwargs)[source]

Bases: Response

key_name = 'fingerprint_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'fingerprint'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_info(key, **kwargs)[source]

Bases: Response

Returned by a rule to surface information about a system.

Example:

@rule(InstalledRpms)
def report(rpms):
   bash = rpms.newest("bash")
   return make_info("BASH_VERSION", bash=bash.nvra)
key_name = 'info_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'info'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_metadata(**kwargs)[source]

Bases: Response

Allows a rule to convey additional metadata about a system to downstream systems. It doesn’t convey success or failure but purely information that may be aggregated with other make_metadata responses. As such, it has no response key.

response_type = 'metadata'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_metadata_key(key, value)[source]

Bases: Response

adjust_for_length(key, r, kwargs)[source]

Converts the response to a string and compares its length to a max length specified in settings. If the response is too long, an error is logged, and an abbreviated response is returned instead.

key_name = 'key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'metadata_key'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_none[source]

Bases: Response

Used to create a response for a rule that returns None.

This is not intended to be used by plugins, only infrastructure, but it is not private so that we can easily add it to reporting.

key_name = 'none_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'none'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_pass(key, **kwargs)[source]

Bases: Response

Returned by a rule to signal that its conditions explicitly have not been met. In other words, the rule has all of the information it needs to determine that the system it’s analyzing is not in the state the rule was meant to catch.

An example rule might check whether a system is vulnerable to a well defined exploit or has a bug in a specific version of a package. If it can say for sure “the system does not have this exploit” or “the system does not have the buggy version of the package installed”, then it should return an instance of make_pass.

Example:

# completely made up package
buggy = InstalledRpms.from_package("bash-3.4.23-1.el7")

@rule(InstalledRpms)
def report(installed_rpms):
   bash = installed_rpms.newest("bash")
   if bash == buggy:
       return make_fail("BASH_BUG_123", bash=bash)
   return make_pass("BASH", bash=bash)
key_name = 'pass_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'pass'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.make_response(key, **kwargs)[source]

Bases: Response

Returned by a rule to signal that its conditions have been met.

Example:

# completely made up package
buggy = InstalledRpms.from_package("bash-3.4.23-1.el7")

@rule(InstalledRpms)
def report(installed_rpms):
   bash = installed_rpms.newest("bash")
   if bash == buggy:
       return make_response("BASH_BUG_123", bash=bash)
   return make_pass("BASH", bash=bash)

Deprecated since version 1.x: Use make_fail instead.

key_name = 'error_key'

key_name is something like ‘error_key’, ‘fingerprint_key’, etc. It is the key downstream systems use to look up the exact response returned by a rule.

response_type = 'rule'

response_type is something like ‘rule’, ‘metadata’, ‘fingerprint’, etc. It is how downstream systems identify the type of information returned by a rule.

class insights.core.plugins.metadata(*args, **kwargs)[source]

Bases: parser

Used for old cluster uber-archives.

Deprecated since version 1.x.

Warning

Do not use this component type.

requires = ['metadata.json']

a list of components that all components decorated with this type will implicitly require. Additional components passed to the decorator will be appended to this list.

class insights.core.plugins.parser(*args, **kwargs)[source]

Bases: PluginType

Decorates a component responsible for parsing the output of a datasource. @parser should accept multiple arguments, the first will ALWAYS be the datasource the parser component should handle. Any subsequent argument will be a component used to determine if the parser should fire. @parser should only decorate subclasses of insights.core.Parser.

Warning

If a Parser component handles a datasource that returns a list, a Parser instance will be created for each element of the list. Combiners or rules that depend on the Parser will be passed the list of instances and not a single parser instance. By default, if any parser in the list succeeds, those parsers are passed on to dependents, even if others fail. If all parsers should succeed or fail together, pass continue_on_error=False.

invoke(broker)[source]

Handles invocation of the component. The default implementation invokes it with positional arguments based on order of dependency declaration.
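
A minimal sketch of a parser component, assuming the Specs.hostname datasource; the class name and attribute are made up:

from insights.core import Parser
from insights.core.plugins import parser
from insights.specs import Specs

@parser(Specs.hostname)
class MyHostname(Parser):
    def parse_content(self, content):
        # content is the list of lines from the datasource
        self.fqdn = content[0].strip() if content else None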

class insights.core.plugins.remoteresource(*deps, **kwargs)[source]

Bases: PluginType

ComponentType for a component for remote web resources.

class insights.core.plugins.rule(*args, **kwargs)[source]

Bases: PluginType

Decorator for components that encapsulate some logic that depends on the data model of a system. Rules can depend on datasource instances, parser instances, combiner instances, or anything else.

For example:

@rule(SshDConfig, InstalledRpms, [ChkConfig, UnitFiles], optional=[IPTables, IpAddr])
def report(sshd_config, installed_rpms, chk_config, unit_files, ip_tables, ip_addr):
    # ...
    # ... some complicated logic
    # ...
    bash = installed_rpms.newest("bash")
    return make_pass("BASH", bash=bash)

Notice that the arguments to report correspond to the dependencies in the @rule decorator and are in the same order.

Parameters to the decorator have these forms:

Criteria       Example Decorator Arguments   Description
Required       SshDConfig, InstalledRpms     Regular arguments
At Least One   [ChkConfig, UnitFiles]        An argument as a list
Optional       optional=[IPTables, IpAddr]   A list following optional=

If a parameter is required, the value provided for it is guaranteed not to be None. In the example above, sshd_config and installed_rpms will not be None.

At least one of the arguments in an “at least one” list will not be None. In the example, either or both of chk_config and unit_files will not be None.

Any or all arguments for optional parameters may be None.

The following keyword arguments may be passed to the decorator:

Keyword Arguments:
  • requires (list) -- a list of components that all components decorated with this type will require. Instead of using requires=[...], just pass dependencies as variable arguments to @rule as in the example above.

  • optional (list) -- a list of components that all components decorated with this type will implicitly depend on optionally. Additional components passed as optional to the decorator will be appended to this list.

  • metadata (dict) -- an arbitrary dictionary of information to associate with the component you’re decorating. It can be retrieved with get_metadata.

  • tags (list) -- a list of strings that categorize the component. Useful for formatting output or sifting through results for components you care about.

  • group -- GROUPS.single or GROUPS.cluster. Used to organize components into “groups” that run together with insights.core.dr.run().

  • cluster (bool) -- if True will put the component into the GROUPS.cluster group. Defaults to False. Overrides group if True.

  • content (string or dict) -- a jinja2 template or dictionary of jinja2 templates. The Response subclasses rules can return are dictionaries. make_pass, make_fail, and make_response all accept first a key and then a list of arbitrary keyword arguments. If content is a dictionary, the key is used to look up the template into which the rest of the keyword arguments will be interpolated. If content is a string, then it is used for all return values of the rule. If content isn’t defined but a CONTENT variable is declared in the module, it will be used for every rule in the module; it can also be a string or a list of dictionaries.

  • links (dict) -- a dictionary with strings as keys and lists of urls as values. The keys categorize the urls, e.g. “kcs” for kcs urls and “bugzilla” for bugzilla urls.

content = None
process(broker)[source]

Ensures dependencies have been met before delegating to self.invoke.
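
A hedged sketch tying the content keyword argument together with the rule decorator; the keys, templates, and version check are made up:

from insights.core.plugins import make_fail, make_pass, rule
from insights.parsers.installed_rpms import InstalledRpms

CONTENT = {
    "BASH_BUG_123": "Bash {{bash}} is affected.",
    "BASH": "Bash {{bash}} is healthy.",
}

@rule(InstalledRpms, content=CONTENT)
def report(rpms):
    bash = rpms.newest("bash")
    if bash and bash.nvra.startswith("bash-3.4.23"):  # completely made-up buggy version
        return make_fail("BASH_BUG_123", bash=bash.nvra)
    return make_pass("BASH", bash=bash.nvra if bash else "not installed")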

insights.core.remote_resource

class insights.core.remote_resource.CachedRemoteResource[source]

Bases: RemoteResource

RemoteResource subclass that sets up caching for subsequent Web resource requests.

Examples

>>> from insights.core.remote_resource import CachedRemoteResource
>>> crr = CachedRemoteResource()
>>> rtn = crr.get("http://google.com")
>>> print (rtn.content)
backend = 'DictCache'

Type of storage for the cache: DictCache, FileCache, or RedisCache.

Type:

str

expire_after = 180

Amount of time in seconds before the cache expires.

Type:

float

file_cache_path = '.web_cache'

Path where the file cache will be stored if the FileCache backend is specified.

Type:

str

redis_host = 'localhost'

Hostname of the redis instance if the RedisCache backend is specified.

Type:

str

redis_port = 6379

Port used to contact the redis instance if the RedisCache backend is specified.

Type:

int

class insights.core.remote_resource.DefaultHeuristic(expire_after)[source]

Bases: BaseHeuristic

BaseHeuristic subclass that sets the default caching headers if not supplied by the remote service.

default_cache_vars = 'Remote service caching headers not set correctly, using default caching'

Message content warning that the response from the remote server did not include proper HTTP cache headers, so we will use default cache settings.

Type:

str

server_cache_headers = 'Caching being done based on caching headers returned by remote service'

Message content warning that we are using cache settings returned by the remote server.

Type:

str

update_headers(response)[source]

Returns the updated caching headers.

Parameters:

response (HttpResponse) -- The response from the remote service

Returns:

The HTTP caching headers.

Return type:

(HttpResponse.Headers)

warning(response)[source]

Return a valid 1xx warning header value describing the cache adjustments.

The response is provided to allow warnings like 113 (http://tools.ietf.org/html/rfc7234#section-5.5.4) where we need to explicitly say the response is over 24 hours old.

class insights.core.remote_resource.RemoteResource(session=None)[source]

Bases: object

RemoteResource class for accessing external Web resources.

Examples

>>> from insights.core.remote_resource import RemoteResource
>>> rr = RemoteResource()
>>> rtn = rr.get("http://google.com")
>>> print (rtn.content)
get(url, params={}, headers={}, auth=(), certificate_path=None)[source]

Returns the response payload from the request to the given URL.

Parameters:
  • url (str) -- The URL of the web API to which the request is being made.

  • params (dict) -- Dictionary containing the query string parameters.

  • headers (dict) -- HTTP Headers that may be needed for the request.

  • auth (tuple) -- User ID and password for Basic Auth

  • certificate_path (str) -- Path to the ssl certificate.

Returns:

Response object from the requests.get API request.

Return type:

(HttpResponse)

timeout = 10

Time in seconds for the requests.get API call to wait before raising a timeout exception.

Type:

float

insights.core.spec_factory

class insights.core.spec_factory.CommandOutputProvider(cmd, ctx, root='insights_commands', save_as=None, args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None, cleaner=None)[source]

Bases: ContentProvider

Class used in datasources to return output from commands.

create_args()[source]
create_env()[source]
load()[source]
validate()[source]
class insights.core.spec_factory.ContainerCommandProvider(cmd_path, ctx, image=None, args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None, cleaner=None)[source]

Bases: ContainerProvider

class insights.core.spec_factory.ContainerFileProvider(cmd_path, ctx, image=None, args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None, cleaner=None)[source]

Bases: ContainerProvider

class insights.core.spec_factory.ContainerProvider(cmd_path, ctx, image=None, args=None, split=True, keep_rc=False, ds=None, timeout=None, inherit_env=None, override_env=None, signum=None, cleaner=None)[source]

Bases: CommandOutputProvider

class insights.core.spec_factory.ContentProvider[source]

Bases: object

property content
load()[source]
property path
stream()[source]

Returns a generator of lines instead of a list of lines.

write(dst)[source]
class insights.core.spec_factory.DatasourceProvider(content, relative_path, root='/', save_as=None, ds=None, ctx=None, cleaner=None, no_obfuscate=None, no_redact=False)[source]

Bases: ContentProvider

load()[source]
class insights.core.spec_factory.FileProvider(relative_path, root='/', save_as=None, ds=None, ctx=None, cleaner=None)[source]

Bases: ContentProvider

validate()[source]
class insights.core.spec_factory.RawFileProvider(relative_path, root='/', save_as=None, ds=None, ctx=None, cleaner=None)[source]

Bases: FileProvider

Class used in datasources that returns the contents of a file as a single string. The file is not filtered/obfuscated/redacted.

load()[source]
write(dst)[source]
class insights.core.spec_factory.RegistryPoint(metadata=None, multi_output=False, raw=False, filterable=False, no_obfuscate=None, no_redact=False, prio=0)[source]

Bases: object

insights.core.spec_factory.SAFE_ENV = {'LANG': 'C.UTF-8', 'LC_ALL': 'C', 'PATH': '/bin:/usr/bin:/sbin:/usr/sbin:/usr/share/Modules/bin'}

A minimal set of environment variables for use in subprocess calls

class insights.core.spec_factory.SerializedOutputProvider(relative_path, root='/', save_as=None, ds=None, ctx=None, cleaner=None)[source]

Bases: TextFileProvider

class insights.core.spec_factory.SerializedRawOutputProvider(relative_path, root='/', save_as=None, ds=None, ctx=None, cleaner=None)[source]

Bases: RawFileProvider

class insights.core.spec_factory.SpecDescriptor(func)[source]

Bases: object

class insights.core.spec_factory.SpecSet[source]

Bases: object

The base class for all spec declarations. Extend this class and define your datasources directly or with a SpecFactory.

context_handlers = {}
registry = {}
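
A hedged sketch of declaring a spec set with the factory helpers documented below; the spec names and paths are illustrative:

from insights.core.spec_factory import SpecSet, glob_file, simple_command, simple_file

class MySpecs(SpecSet):
    hosts = simple_file("/etc/hosts")
    uptime = simple_command("/usr/bin/uptime")
    httpd_conf = glob_file("/etc/httpd/conf.d/*.conf")
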
class insights.core.spec_factory.SpecSetMeta(name, bases, dct)[source]

Bases: type

The metaclass that converts RegistryPoint markers to registry point datasources and hooks implementations for them into the registry.

class insights.core.spec_factory.TextFileProvider(relative_path, root='/', save_as=None, ds=None, ctx=None, cleaner=None)[source]

Bases: FileProvider

Class used in datasources that returns the contents of a file as a list of lines. Each line is filtered if filters are defined for the datasource.

create_args()[source]

Filtering with “grep” is faster and can be used to shrink the size of the file.

load()[source]
class insights.core.spec_factory.command_with_args(cmd, provider, save_as=None, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: object

Execute a command that has dynamic arguments.

Parameters:
  • cmd (str) -- the command to execute. The command string is broken apart for subprocess operations and may require arguments supplied by the provider.

  • provider (str or tuple) -- argument string or a tuple of argument strings.

  • save_as (str or None) -- path to save the collected file as. It should be a relative path; any leading and trailing ‘/’ will be removed, and the collected file will be renamed to save_as under the ‘insights_commands’ directory.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

  • override_env (dict) -- A dict of environment variables to override from the calling process when the command is invoked.

Returns:

A datasource that returns the output of a command that takes specified arguments passed by the provider.

Return type:

function
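
A hedged sketch of wiring a provider into command_with_args; the provider datasource and command are made up:

from insights.core.context import HostContext
from insights.core.plugins import datasource
from insights.core.spec_factory import command_with_args

@datasource(HostContext)
def default_iface(broker):
    # hypothetical provider supplying the argument string
    return "eth0"

ethtool_default = command_with_args("/sbin/ethtool %s", default_iface)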

class insights.core.spec_factory.container_collect(provider, path=None, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: foreach_execute

Collects the files at the resulting path in running containers.

Parameters:
  • provider (list) -- a list of tuples.

  • path (str) -- the file path template with substitution parameters. The path can also be passed via the provider when it varies case by case; in that case, the path parameter should be None.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

Returns:

A datasource that returns a list of file contents created by substituting each element of provider into the path template.

Return type:

function

class insights.core.spec_factory.container_execute(provider, cmd, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: foreach_execute

Execute a command for each element in provider within a container. Provider is the output of a different datasource that returns a list of tuples. In each tuple, the container engine provider (“podman” or “docker”) and the container_id are two required elements; the remaining elements, if any, are the arguments passed to the command.

Parameters:
  • provider (list) -- a list of tuples. In each tuple, the container engine provider (“podman” or “docker”) and the container_id are two required elements; the remaining elements, if any, are the arguments passed to the cmd.

  • cmd (str) -- a command with substitution parameters. A command string that contains multiple commands separated by a pipe is broken apart, readying them for subprocess operations, e.g. a command with filters applied.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

Returns:

A datasource that returns a list of outputs for each command created by substituting each element of provider into the cmd template.

Return type:

function

insights.core.spec_factory.deserialize_command_output(_type, data, root, ctx, ds)[source]
insights.core.spec_factory.deserialize_container_command(_type, data, root, ctx, ds)[source]
insights.core.spec_factory.deserialize_container_file(_type, data, root, ctx, ds)[source]
insights.core.spec_factory.deserialize_datasource_provider(_type, data, root, ctx, ds)[source]
insights.core.spec_factory.deserialize_raw_file_provider(_type, data, root, ctx, ds)[source]
insights.core.spec_factory.deserialize_text_provider(_type, data, root, ctx, ds)[source]
class insights.core.spec_factory.find(spec, pattern)[source]

Bases: object

Helper class for extracting specific lines from a datasource for direct consumption by a rule.

Example:

service_starts = find(Specs.audit_log, "SERVICE_START")

@rule(service_starts)
def report(starts):
    return make_info("SERVICE_STARTS", num_starts=len(starts))

Parameters:
  • spec (datasource) -- some datasource, ideally filterable.

  • pattern (string / list) -- a string or list of strings to match (no patterns supported)

Returns:

A dict where each key is a command, path, or spec name, and each value is a non-empty list of matching lines. Only paths with matching lines are included.

Raises:

SkipComponent -- if no paths have matching lines.

class insights.core.spec_factory.first_file(paths, save_as=None, context=None, deps=None, kind=<class 'insights.core.spec_factory.TextFileProvider'>, **kwargs)[source]

Bases: object

Creates a datasource that returns the first existing and readable file in paths.

Parameters:
  • paths ([str]) -- list of paths to find and collect.

  • save_as (str or None) -- path to save the collected file as. It should be a relative path; any leading ‘/’ will be removed. If the path ends with ‘/’, the collected file will be stored in the “save_as” directory; if it does not end with ‘/’, the collected file will be renamed to the file with “save_as” as the full path.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- One of TextFileProvider or RawFileProvider.

Returns:

A datasource that returns the first file in paths that exists and is readable.

Return type:

function
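
A minimal sketch; the fallback pair of config paths is illustrative:

from insights.core.spec_factory import first_file

# Collect the chrony config, falling back to the ntp config if absent.
time_conf = first_file(["/etc/chrony.conf", "/etc/ntp.conf"])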

class insights.core.spec_factory.first_of(deps)[source]

Bases: object

Given a list of dependencies, returns the first of the list that exists in the broker. At least one must be present, or this component won’t fire.
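
A minimal sketch, assuming the simple_command and simple_file helpers below; the sources are illustrative:

from insights.core.spec_factory import first_of, simple_command, simple_file

hostname_cmd = simple_command("/usr/bin/hostname -f")
hostname_file = simple_file("/etc/hostname")

# Prefer the command output; fall back to the file.
hostname = first_of([hostname_cmd, hostname_file])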

class insights.core.spec_factory.foreach_collect(provider, path, save_as=None, ignore=None, context=<class 'insights.core.context.HostContext'>, deps=None, kind=<class 'insights.core.spec_factory.TextFileProvider'>, **kwargs)[source]

Bases: object

Substitutes each element in provider into path and collects the files at the resulting paths.

Parameters:
  • provider (list) -- a list of elements or tuples.

  • save_as (str or None) -- directory path to save the collected files in. It should be a relative path; any leading ‘/’ will be removed and a trailing ‘/’ will be added.

  • path (str) -- a path template with substitution parameters.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- one of TextFileProvider or RawFileProvider

Returns:

A datasource that returns a list of file contents created by substituting each element of provider into the path template.

Return type:

function

class insights.core.spec_factory.foreach_execute(provider, cmd, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: object

Execute a command for each element in provider. Provider is the output of a different datasource that returns a list of single elements or a list of tuples. The command should have %s substitution parameters equal to the number of elements in each tuple of the provider.

Parameters:
  • provider (list) -- a list of elements or tuples.

  • cmd (str) -- a command with substitution parameters. A command string that contains multiple commands separated by a pipe is broken apart, readying them for subprocess operations, e.g. a command with filters applied.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

  • override_env (dict) -- A dict of environment variables to override from the calling process when the command is invoked.

Returns:

A datasource that returns a list of outputs for each command created by substituting each element of provider into the cmd template.

Return type:

function
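
A hedged sketch of foreach_execute; the provider datasource and command template are made up:

from insights.core.context import HostContext
from insights.core.plugins import datasource
from insights.core.spec_factory import foreach_execute

@datasource(HostContext)
def block_devices(broker):
    # hypothetical provider; each element is substituted into the template
    return ["sda", "sdb"]

smartctl = foreach_execute(block_devices, "/usr/sbin/smartctl -a /dev/%s")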

class insights.core.spec_factory.glob_file(patterns, save_as=None, ignore=None, context=None, deps=None, kind=<class 'insights.core.spec_factory.TextFileProvider'>, max_files=1000, **kwargs)[source]

Bases: object

Creates a datasource that reads all files matching the glob pattern(s).

Parameters:
  • patterns (str or [str]) -- glob pattern(s) of paths to collect.

  • save_as (str or None) -- directory path to save the collected files in. It should be a relative path; any leading ‘/’ will be removed and a trailing ‘/’ will be added.

  • ignore (regex) -- a regular expression that is used to filter the paths matched by pattern(s).

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- One of TextFileProvider or RawFileProvider.

  • max_files (int) -- Maximum number of glob files to process.

Returns:

A datasource that reads all files matching the glob patterns.

Return type:

function

class insights.core.spec_factory.head(dep, **kwargs)[source]

Bases: object

Return the first element of any datasource that produces a list.

class insights.core.spec_factory.listdir(path, context=None, ignore=None, deps=None)[source]

Bases: object

Execute a simple directory listing of all the files and directories in path.

Parameters:
  • path (str) -- directory to list.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • ignore (str) -- regular expression defining names to ignore.

Returns:

A datasource that returns a sorted list of file and directory names in the directory specified by path. The list will be empty when the directory is empty or all names get ignored.

Return type:

function

class insights.core.spec_factory.listglob(path, context=None, ignore=None, deps=None)[source]

Bases: listdir

List paths matching a glob pattern.

Parameters:
  • path (str) -- glob pattern to list.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • ignore (str) -- regular expression defining paths to ignore.

Returns:

A datasource that returns the list of paths that match the given glob pattern. The list will be empty when nothing matches.

Return type:

function

insights.core.spec_factory.serialize_command_output(obj, root)[source]
insights.core.spec_factory.serialize_container_command(obj, root)[source]
insights.core.spec_factory.serialize_container_file_output(obj, root)[source]
insights.core.spec_factory.serialize_datasource_provider(obj, root)[source]
insights.core.spec_factory.serialize_raw_file_provider(obj, root)[source]
insights.core.spec_factory.serialize_text_file_provider(obj, root)[source]
class insights.core.spec_factory.simple_command(cmd, save_as=None, context=<class 'insights.core.context.HostContext'>, deps=None, split=True, keep_rc=False, timeout=None, inherit_env=None, override_env=None, signum=None, **kwargs)[source]

Bases: object

Execute a simple command that has no dynamic arguments.

Parameters:
  • cmd (str) -- the command(s) to execute. A command string that contains multiple commands separated by a pipe is broken apart, readying them for subprocess operations, e.g. a command with filters applied.

  • save_as (str or None) -- path to save the collected file as. It should be a relative path; any leading and trailing ‘/’ will be removed, and the collected file will be renamed to save_as under the ‘insights_commands’ directory.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • split (bool) -- whether the output of the command should be split into a list of lines

  • keep_rc (bool) -- whether to return the error code returned by the process executing the command. If False, any return code other than zero will raise a CalledProcessError. If True, the return code and output are always returned.

  • timeout (int) -- Number of seconds to wait for the command to complete. If the timeout is reached before the command returns, a CalledProcessError is raised. If None, timeout is infinite.

  • inherit_env (list) -- The list of environment variables to inherit from the calling process when the command is invoked.

  • override_env (dict) -- A dict of environment variables to override from the calling process when the command is invoked.

Returns:

A datasource that returns the output of a command that takes no arguments.

Return type:

function

class insights.core.spec_factory.simple_file(path, save_as=None, context=None, deps=None, kind=<class 'insights.core.spec_factory.TextFileProvider'>, **kwargs)[source]

Bases: object

Creates a datasource that reads the file at path when evaluated.

Parameters:
  • path (str) -- path to the file to collect.

  • save_as (str or None) -- path to save the collected file as. It should be a relative path; any leading ‘/’ will be removed. If the path ends with ‘/’, the collected file will be stored in the “save_as” directory; if it does not end with ‘/’, the collected file will be renamed to the file with “save_as” as the full path.

  • context (ExecutionContext) -- the context under which the datasource should run.

  • kind (FileProvider) -- One of TextFileProvider or RawFileProvider.

Returns:

A datasource that reads the file at the given path.

Return type:

function

insights.core.taglang

Simple language for defining predicates against a list or set of strings.

Operator Precedence:
  • ! high - opposite truth value of its predicate

  • / high - starts a regex that continues until whitespace unless quoted

  • & medium - “and” of two predicates

  • | low - “or” of two predicates

  • , low - “or” of two predicates. Synonym for |.

It supports grouping with parentheses and quoted strings/regexes surrounded with either single or double quotes.

Examples

>>> pred = parse("a | b & !c")  # means (a or (b and (not c)))
>>> pred(["a"])
True
>>> pred(["b"])
True
>>> pred(["b", "c"])
False
>>> pred = parse("/net | apache")
>>> pred(["networking"])
True
>>> pred(["mynetwork"])
True
>>> pred(["apache"])
True
>>> pred(["security"])
False
>>> pred = parse("(a | b) & c")
>>> pred(["a", "c"])
True
>>> pred(["b", "c"])
True
>>> pred(["a"])
False
>>> pred(["b"])
False
>>> pred(["c"])
False

Regular expressions start with a forward slash / and continue until whitespace unless they are quoted with either single or double quotes. This means that they can consume what would normally be considered an operator or a closing parenthesis if you aren’t careful.

For example, this is a parse error because the regex consumes the comma:

>>> pred = parse("/net, apache")
Exception

Instead, do this:

>>> pred = parse("/net , apache")

or this:

>>> pred = parse("/net | apache")

or this:

>>> pred = parse("'/net', apache")

class insights.core.taglang.And(left, right)[source]

Bases: Predicate

The values must satisfy both the left and the right condition.

test(value)[source]
class insights.core.taglang.Eq(value)[source]

Bases: Predicate

The value must be in the set of values.

test(values)[source]
class insights.core.taglang.Not(pred)[source]

Bases: Predicate

The values must not satisfy the wrapped condition.

test(value)[source]
class insights.core.taglang.Or(left, right)[source]

Bases: Predicate

The values must satisfy either the left or the right condition.

test(value)[source]
class insights.core.taglang.Predicate[source]

Bases: object

Provides __call__ for invoking the Predicate like a function without having to explicitly call its test method.

class insights.core.taglang.Regex(value)[source]

Bases: Predicate

The regex must match at least one of the values.

test(values)[source]
insights.core.taglang.negate(args)[source]
insights.core.taglang.oper(args)[source]

insights.parsers

insights.parsers.calc_of