Ben Chuanlong Du's Blog

It is never too late to learn.

Tips on Sqlfluff

Installation

In [2]:
pip3 install -U sqlfluff
Collecting sqlfluff
  Downloading sqlfluff-0.5.2-py3-none-any.whl (278 kB)
     |████████████████████████████████| 278 kB 2.9 MB/s 
Requirement already satisfied: pathspec in /usr/local/lib/python3.9/site-packages (from sqlfluff) (0.8.1)
Collecting pytest
  Downloading pytest-6.2.3-py3-none-any.whl (280 kB)
     |████████████████████████████████| 280 kB 42.1 MB/s 
Collecting cached-property
  Using cached cached_property-1.5.2-py2.py3-none-any.whl (7.6 kB)
Collecting configparser
  Downloading configparser-5.0.2-py3-none-any.whl (19 kB)
Requirement already satisfied: Jinja2 in /usr/local/lib/python3.9/site-packages (from sqlfluff) (2.11.3)
Collecting typing-extensions
  Using cached typing_extensions-3.7.4.3-py3-none-any.whl (22 kB)
Collecting diff-cover>=2.5.0
  Downloading diff_cover-5.0.1-py3-none-any.whl (44 kB)
     |████████████████████████████████| 44 kB 11.3 MB/s 
Collecting colorama>=0.3
  Using cached colorama-0.4.4-py2.py3-none-any.whl (16 kB)
Collecting bench-it
  Using cached bench_it-1.0.1-py2.py3-none-any.whl (19 kB)
Collecting oyaml
  Using cached oyaml-1.0-py2.py3-none-any.whl (3.0 kB)
Requirement already satisfied: click>=7.1 in /usr/local/lib/python3.9/site-packages (from sqlfluff) (7.1.2)
Collecting appdirs
  Using cached appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Requirement already satisfied: chardet>=3.0.0 in /usr/local/lib/python3.9/site-packages (from diff-cover>=2.5.0->sqlfluff) (4.0.0)
Collecting jinja2-pluralize
  Using cached jinja2_pluralize-0.3.0-py2.py3-none-any.whl (4.8 kB)
Collecting pluggy
  Using cached pluggy-0.13.1-py2.py3-none-any.whl (18 kB)
Requirement already satisfied: pygments in /usr/local/lib/python3.9/site-packages (from diff-cover>=2.5.0->sqlfluff) (2.7.2)
Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.9/site-packages (from Jinja2->sqlfluff) (1.1.1)
Collecting inflect>=0.2.4
  Downloading inflect-5.3.0-py3-none-any.whl (32 kB)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.9/site-packages (from oyaml->sqlfluff) (5.4.1)
Requirement already satisfied: toml in /usr/local/lib/python3.9/site-packages (from pytest->sqlfluff) (0.10.2)
Requirement already satisfied: attrs>=19.2.0 in /usr/local/lib/python3.9/site-packages (from pytest->sqlfluff) (20.3.0)
Collecting iniconfig
  Downloading iniconfig-1.1.1-py2.py3-none-any.whl (5.0 kB)
Requirement already satisfied: packaging in /usr/local/lib/python3.9/site-packages (from pytest->sqlfluff) (20.9)
Collecting py>=1.8.2
  Downloading py-1.10.0-py2.py3-none-any.whl (97 kB)
     |████████████████████████████████| 97 kB 12.8 MB/s 
Requirement already satisfied: pyparsing>=2.0.2 in /usr/local/lib/python3.9/site-packages (from packaging->pytest->sqlfluff) (2.4.7)
Installing collected packages: inflect, py, pluggy, jinja2-pluralize, iniconfig, typing-extensions, pytest, oyaml, diff-cover, configparser, colorama, cached-property, bench-it, appdirs, sqlfluff
Successfully installed appdirs-1.4.4 bench-it-1.0.1 cached-property-1.5.2 colorama-0.4.4 configparser-5.0.2 diff-cover-5.0.1 inflect-5.3.0 iniconfig-1.1.1 jinja2-pluralize-0.3.0 oyaml-1.0 pluggy-0.13.1 py-1.10.0 pytest-6.2.3 sqlfluff-0.5.2 typing-extensions-3.7.4.3

General Tips and Traps

  1. SQLfluff supports Jinja template!

Safe to fix

  1. L001: Unneccessary trailing whitespace.
  2. L008: Commas should be followed by a single whitespace unless followed by a comment.

Ignore

L: 75 | P: 5 | LXR | Unable to lex characters: ''${candidat'...'

Parsing Error

  1. PRS: Found unparsable section: '-- /*Select list of users to choose from...'

Configuration

  1. .sqlfluff
  1. .sqlfluffignore

  2. You can customize linting and fixing of SQL files by customizing rules. Please refer to Rules Reference for a complete list of rules.

Command-line APIs

In [3]:
!sqlfluff lint --help
Usage: sqlfluff lint [OPTIONS] [PATHS]...

  Lint SQL files via passing a list of files or using stdin.

  PATH is the path to a sql file or directory to lint. This can be either a
  file ('path/to/file.sql'), a path ('directory/of/sql/files'), a single
  ('-') character to indicate reading from *stdin* or a dot/blank ('.'/' ')
  which will be interpreted like passing the current working directory as a
  path argument.

  Linting SQL files:

      sqlfluff lint path/to/file.sql     sqlfluff lint
      directory/of/sql/files

  Linting a file via stdin (note the lone '-' character):

      cat path/to/file.sql | sqlfluff lint -     echo 'select col from tbl'
      | sqlfluff lint -

Options:
  -n, --nocolor                   No color - if this is set then the output
                                  will be without ANSI color codes.

  -v, --verbose                   Verbosity, how detailed should the output
                                  be. This is *stackable*, so `-vv` is more
                                  verbose than `-v`. For the most verbose
                                  option try `-vvvv` or `-vvvvv`.

  --version                       Show the version and exit.
  --logger [parser|linter|rules]  Choose to limit the logging to one of the
                                  loggers.

  --bench                         Set this flag to engage the benchmarking
                                  tool output.

  --ignore TEXT                   Ignore particular families of errors so that
                                  they don't cause a failed run. For example
                                  `--ignore parsing` would mean that any
                                  parsing errors are ignored and don't
                                  influence the success or fail of a run.
                                  Multiple options are possible if comma
                                  separated e.g. `--ignore
                                  parsing,templating`.

  --exclude-rules TEXT            Exclude specific rules. For example
                                  specifying `--exclude-rules L001` will
                                  remove rule `L001` (Unnecessary trailing
                                  whitespace) from the set of considered
                                  rules. This could either be the whitelist,
                                  or the general set if there is no specific
                                  whitelist. Multiple rules can be specified
                                  with commas e.g. `--exclude-rules L001,L002`
                                  will exclude violations of rule `L001` and
                                  rule `L002`.

  --rules TEXT                    Narrow the search to only specific rules.
                                  For example specifying `--rules L001` will
                                  only search for rule `L001` (Unnecessary
                                  trailing whitespace). Multiple rules can be
                                  specified with commas e.g. `--rules
                                  L001,L002` will specify only looking for
                                  violations of rule `L001` and rule `L002`.

  --templater TEXT                The templater to use (default=jinja)
  --dialect TEXT                  The dialect of SQL to lint (default=ansi)
  -f, --format [human|json|yaml]  What format to return the lint result in.
  --nofail                        If set, the exit code will always be zero,
                                  regardless of violations found. This is
                                  potentially useful during rollout.

  --disregard-sqlfluffignores     Perform the operation regardless of
                                  .sqlfluffignore configurations

  --help                          Show this message and exit.
In [ ]:
sqlfluff lint test.sql
In [ ]:
sqlfluff fix test.sql

Customized Fix Rules for sqlfluff

Below is my customized fix rules for sqlfluff. It uses upper case for SQL keywords and lower case for identifiers.

[sqlfluff]
verbose = 0
nocolor = False
dialect = ansi
templater = jinja
rules = None
exclude_rules = None
recurse = 0
output_line_length = 80
runaway_limit = 10
ignore_templated_areas = True
# Comma separated list of file extensions to lint.

# NB: This config will only apply in the root folder.
sql_file_exts = .sql,.sql.j2,.dml,.ddl

[sqlfluff:indentation]
indented_joins = False
template_blocks_indent = True

[sqlfluff:templater]
unwrap_wrapped_queries = True

[sqlfluff:templater:jinja]
apply_dbt_builtins = True

[sqlfluff:templater:jinja:macros]
# Macros provided as builtins for dbt projects
dbt_ref = {% macro ref(model_ref) %}{{model_ref}}{% endmacro %}
dbt_source = {% macro source(source_name, table) %}{{source_name}}_{{table}}{% endmacro %}
dbt_config = {% macro config() %}{% for k in kwargs %}{% endfor %}{% endmacro %}
dbt_var = {% macro var(variable) %}item{% endmacro %}
dbt_is_incremental = {% macro is_incremental() %}True{% endmacro %}

# Some rules can be configured directly from the config common to other rules.
[sqlfluff:rules]
tab_space_size = 4
max_line_length = 80
indent_unit = space
comma_style = trailing
allow_scalar = True
single_table_references = consistent
unquoted_identifiers_policy = all

# Some rules have their own specific config.
[sqlfluff:rules:L003]
lint_templated_tokens = True

[sqlfluff:rules:L010]  # Keywords
capitalisation_policy = upper

[sqlfluff:rules:L014]  # Unquoted identifiers
extended_capitalisation_policy = lower

[sqlfluff:rules:L016]
ignore_comment_lines = False

[sqlfluff:rules:L029]  # Keyword identifiers
unquoted_identifiers_policy = aliases

[sqlfluff:rules:L030]  # Function names
capitalisation_policy = lower

[sqlfluff:rules:L038]
select_clause_trailing_comma = forbid

[sqlfluff:rules:L040]  # Null & Boolean Literals
capitalisation_policy = upper

[sqlfluff:rules:L042]
# By default, allow subqueries in from clauses, but not join clauses.
forbid_subquery_in = join

[sqlfluff:rules:L047]  # Consistent syntax to count all rows
prefer_count_1 = False

Python APIs

In [1]:
import sqlfluff

Parse a quite complex query.

In [36]:
sqlfluff.parse("select c1 from db.t1").tree.to_tuple()
Out[36]:
('file',
 (('statement',
   (('select_statement',
     (('select_clause',
       (('keyword', ()),
        ('whitespace', ()),
        ('select_clause_element',
         (('column_reference', (('identifier', ()),)),)))),
      ('whitespace', ()),
      ('from_clause',
       (('keyword', ()),
        ('whitespace', ()),
        ('from_expression',
         (('from_expression_element',
           (('table_expression',
             (('table_reference',
               (('identifier', ()),
                ('dot', ()),
                ('identifier', ()))),)),)),)))))),)),))
In [2]:
sql = """
    WITH foo AS (
        SELECT * FROM bar.bar
    ),
    baz AS (
        SELECT * FROM bap
    )
    SELECT 
        * 
    FROM 
        foo
    INNER JOIN 
        baz 
    USING (
        user_id
    )
    INNER JOIN 
        ban 
    USING (
        user_id
    )
    """
parsed = sqlfluff.parse(sql)
In [3]:
type(parsed)
Out[3]:
sqlfluff.core.linter.ParsedString
In [5]:
parsed.count("table_reference")
Out[5]:
0
In [6]:
parsed.tree.get_table_references()
Out[6]:
{'ban', 'bap', 'bar.bar'}
In [18]:
parsed.index("SELECT")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-9dd70820bf23> in <module>
----> 1 parsed.index("SELECT")

ValueError: tuple.index(x): x not in tuple
In [22]:
parsed.time_dict
Out[22]:
{'templating': 0.005710315000001742,
 'lexing': 0.004631195999998283,
 'parsing': 0.031160859000003427}
In [24]:
parsed.tree
Out[24]:
<FileSegment: ([0](1, 1, 1))>
In [27]:
parsed.tree.allow_empty
Out[27]:
True
In [30]:
?parsed.tree.as_record
Signature: parsed.tree.as_record(**kwargs)
Docstring:
Return the segment as a structurally simplified record.

This is useful for serialization to yaml or json.
kwargs passed to to_tuple
File:      /usr/local/lib/python3.9/site-packages/sqlfluff/core/parser/segments/base.py
Type:      method
In [25]:
dir(parsed.tree)
Out[25]:
['__annotations__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_comments',
 '_is_expandable',
 '_name',
 '_non_comments',
 '_preface',
 '_realign_segments',
 '_reconstruct',
 '_suffix',
 'allow_empty',
 'apply_fixes',
 'as_record',
 'can_start_end_non_code',
 'comment_seperate',
 'expand',
 'get_child',
 'get_children',
 'get_end_pos_marker',
 'get_start_pos_marker',
 'get_table_references',
 'invalidate_caches',
 'is_code',
 'is_comment',
 'is_expandable',
 'is_meta',
 'is_optional',
 'is_raw',
 'is_segment',
 'is_type',
 'is_whitespace',
 'iter_patches',
 'iter_raw_seg',
 'iter_segments',
 'iter_unparsables',
 'match',
 'match_grammar',
 'matched_length',
 'name',
 'optional',
 'parse',
 'parse_grammar',
 'path_to',
 'pos_marker',
 'raw',
 'raw_list',
 'raw_upper',
 'realign',
 'recursive_crawl',
 'segments',
 'segs_to_tuple',
 'select_children',
 'simple',
 'stringify',
 'structural_simplify',
 'to_tuple',
 'trim_chars',
 'trim_start',
 'type',
 'type_set',
 'validate_segments']
In [43]:
parsed.tree.segments
Out[43]:
(<newline_RawSegment: ([0](1, 1, 1)) '\n'>,
 <whitespace_RawSegment: ([1](1, 2, 1)) '    '>,
 <StatementSegment: ([5](1, 2, 5))>,
 <newline_RawSegment: ([271](1, 21, 6)) '\n'>,
 <whitespace_RawSegment: ([272](1, 22, 1)) '    '>)
In [39]:
print(parsed.tree.stringify())
[0](1, 1, 1)        |file:
[0](1, 1, 1)        |    newline:                                                  '\n'
[1](1, 2, 1)        |    whitespace:                                               '    '
[5](1, 2, 5)        |    statement:
[5](1, 2, 5)        |        with_compound_statement:
[5](1, 2, 5)        |            keyword:                                          'WITH'
[9](1, 2, 9)        |            whitespace:                                       ' '
[10](1, 2, 10)      |            common_table_expression:
[10](1, 2, 10)      |                identifier:                                   'foo'
[13](1, 2, 13)      |                whitespace:                                   ' '
[14](1, 2, 14)      |                keyword:                                      'AS'
[16](1, 2, 16)      |                whitespace:                                   ' '
[17](1, 2, 17)      |                start_bracket:                                '('
[18](1, 2, 18)      |                [META] indent:
[18](1, 2, 18)      |                newline:                                      '\n'
[19](1, 3, 1)       |                whitespace:                                   '        '
[27](1, 3, 9)       |                select_statement:
[27](1, 3, 9)       |                    select_clause:
[27](1, 3, 9)       |                        keyword:                              'SELECT'
[33](1, 3, 15)      |                        [META] indent:
[33](1, 3, 15)      |                        whitespace:                           ' '
[34](1, 3, 16)      |                        select_clause_element:
[34](1, 3, 16)      |                            wildcard_expression:
[34](1, 3, 16)      |                                wildcard_identifier:
[34](1, 3, 16)      |                                    star:                     '*'
[35](1, 3, 17)      |                    whitespace:                               ' '
[36](1, 3, 18)      |                    [META] dedent:
[36](1, 3, 18)      |                    from_clause:
[36](1, 3, 18)      |                        keyword:                              'FROM'
[40](1, 3, 22)      |                        whitespace:                           ' '
[41](1, 3, 23)      |                        from_expression:
[41](1, 3, 23)      |                            [META] indent:
[41](1, 3, 23)      |                            from_expression_element:
[41](1, 3, 23)      |                                table_expression:
[41](1, 3, 23)      |                                    table_reference:
[41](1, 3, 23)      |                                        identifier:           'bar'
[44](1, 3, 26)      |                                        dot:                  '.'
[45](1, 3, 27)      |                                        identifier:           'bar'
[48](1, 3, 30)      |                            [META] dedent:
[48](1, 3, 30)      |                newline:                                      '\n'
[49](1, 4, 1)       |                whitespace:                                   '    '
[53](1, 4, 5)       |                [META] dedent:
[53](1, 4, 5)       |                end_bracket:                                  ')'
[54](1, 4, 6)       |            comma:                                            ','
[55](1, 4, 7)       |            newline:                                          '\n'
[56](1, 5, 1)       |            whitespace:                                       '    '
[60](1, 5, 5)       |            common_table_expression:
[60](1, 5, 5)       |                identifier:                                   'baz'
[63](1, 5, 8)       |                whitespace:                                   ' '
[64](1, 5, 9)       |                keyword:                                      'AS'
[66](1, 5, 11)      |                whitespace:                                   ' '
[67](1, 5, 12)      |                start_bracket:                                '('
[68](1, 5, 13)      |                [META] indent:
[68](1, 5, 13)      |                newline:                                      '\n'
[69](1, 6, 1)       |                whitespace:                                   '        '
[77](1, 6, 9)       |                select_statement:
[77](1, 6, 9)       |                    select_clause:
[77](1, 6, 9)       |                        keyword:                              'SELECT'
[83](1, 6, 15)      |                        [META] indent:
[83](1, 6, 15)      |                        whitespace:                           ' '
[84](1, 6, 16)      |                        select_clause_element:
[84](1, 6, 16)      |                            wildcard_expression:
[84](1, 6, 16)      |                                wildcard_identifier:
[84](1, 6, 16)      |                                    star:                     '*'
[85](1, 6, 17)      |                    whitespace:                               ' '
[86](1, 6, 18)      |                    [META] dedent:
[86](1, 6, 18)      |                    from_clause:
[86](1, 6, 18)      |                        keyword:                              'FROM'
[90](1, 6, 22)      |                        whitespace:                           ' '
[91](1, 6, 23)      |                        from_expression:
[91](1, 6, 23)      |                            [META] indent:
[91](1, 6, 23)      |                            from_expression_element:
[91](1, 6, 23)      |                                table_expression:
[91](1, 6, 23)      |                                    table_reference:
[91](1, 6, 23)      |                                        identifier:           'bap'
[94](1, 6, 26)      |                            [META] dedent:
[94](1, 6, 26)      |                newline:                                      '\n'
[95](1, 7, 1)       |                whitespace:                                   '    '
[99](1, 7, 5)       |                [META] dedent:
[99](1, 7, 5)       |                end_bracket:                                  ')'
[100](1, 7, 6)      |            newline:                                          '\n'
[101](1, 8, 1)      |            whitespace:                                       '    '
[105](1, 8, 5)      |            select_statement:
[105](1, 8, 5)      |                select_clause:
[105](1, 8, 5)      |                    keyword:                                  'SELECT'
[111](1, 8, 11)     |                    [META] indent:
[111](1, 8, 11)     |                    whitespace:                               ' '
[112](1, 8, 12)     |                    newline:                                  '\n'
[113](1, 9, 1)      |                    whitespace:                               '        '
[121](1, 9, 9)      |                    select_clause_element:
[121](1, 9, 9)      |                        wildcard_expression:
[121](1, 9, 9)      |                            wildcard_identifier:
[121](1, 9, 9)      |                                star:                         '*'
[122](1, 9, 10)     |                whitespace:                                   ' '
[123](1, 9, 11)     |                newline:                                      '\n'
[124](1, 10, 1)     |                whitespace:                                   '    '
[128](1, 10, 5)     |                [META] dedent:
[128](1, 10, 5)     |                from_clause:
[128](1, 10, 5)     |                    keyword:                                  'FROM'
[132](1, 10, 9)     |                    whitespace:                               ' '
[133](1, 10, 10)    |                    newline:                                  '\n'
[134](1, 11, 1)     |                    whitespace:                               '        '
[142](1, 11, 9)     |                    from_expression:
[142](1, 11, 9)     |                        [META] indent:
[142](1, 11, 9)     |                        from_expression_element:
[142](1, 11, 9)     |                            table_expression:
[142](1, 11, 9)     |                                table_reference:
[142](1, 11, 9)     |                                    identifier:               'foo'
[145](1, 11, 12)    |                        newline:                              '\n'
[146](1, 12, 1)     |                        whitespace:                           '    '
[150](1, 12, 5)     |                        [META] dedent:
[150](1, 12, 5)     |                        join_clause:
[150](1, 12, 5)     |                            keyword:                          'INNER'
[155](1, 12, 10)    |                            whitespace:                       ' '
[156](1, 12, 11)    |                            keyword:                          'JOIN'
[160](1, 12, 15)    |                            [META] indent:
[160](1, 12, 15)    |                            whitespace:                       ' '
[161](1, 12, 16)    |                            newline:                          '\n'
[162](1, 13, 1)     |                            whitespace:                       '        '
[170](1, 13, 9)     |                            from_expression_element:
[170](1, 13, 9)     |                                table_expression:
[170](1, 13, 9)     |                                    table_reference:
[170](1, 13, 9)     |                                        identifier:           'baz'
[173](1, 13, 12)    |                            whitespace:                       ' '
[174](1, 13, 13)    |                            newline:                          '\n'
[175](1, 14, 1)     |                            whitespace:                       '    '
[179](1, 14, 5)     |                            keyword:                          'USING'
[184](1, 14, 10)    |                            [META] indent:
[184](1, 14, 10)    |                            whitespace:                       ' '
[185](1, 14, 11)    |                            start_bracket:                    '('
[186](1, 14, 12)    |                            [META] indent:
[186](1, 14, 12)    |                            newline:                          '\n'
[187](1, 15, 1)     |                            whitespace:                       '        '
[195](1, 15, 9)     |                            identifier:                       'user_id'
[202](1, 15, 16)    |                            newline:                          '\n'
[203](1, 16, 1)     |                            whitespace:                       '    '
[207](1, 16, 5)     |                            [META] dedent:
[207](1, 16, 5)     |                            end_bracket:                      ')'
[208](1, 16, 6)     |                            [META] dedent:
[208](1, 16, 6)     |                            [META] dedent:
[208](1, 16, 6)     |                        newline:                              '\n'
[209](1, 17, 1)     |                        whitespace:                           '    '
[213](1, 17, 5)     |                        join_clause:
[213](1, 17, 5)     |                            keyword:                          'INNER'
[218](1, 17, 10)    |                            whitespace:                       ' '
[219](1, 17, 11)    |                            keyword:                          'JOIN'
[223](1, 17, 15)    |                            [META] indent:
[223](1, 17, 15)    |                            whitespace:                       ' '
[224](1, 17, 16)    |                            newline:                          '\n'
[225](1, 18, 1)     |                            whitespace:                       '        '
[233](1, 18, 9)     |                            from_expression_element:
[233](1, 18, 9)     |                                table_expression:
[233](1, 18, 9)     |                                    table_reference:
[233](1, 18, 9)     |                                        identifier:           'ban'
[236](1, 18, 12)    |                            whitespace:                       ' '
[237](1, 18, 13)    |                            newline:                          '\n'
[238](1, 19, 1)     |                            whitespace:                       '    '
[242](1, 19, 5)     |                            keyword:                          'USING'
[247](1, 19, 10)    |                            [META] indent:
[247](1, 19, 10)    |                            whitespace:                       ' '
[248](1, 19, 11)    |                            start_bracket:                    '('
[249](1, 19, 12)    |                            [META] indent:
[249](1, 19, 12)    |                            newline:                          '\n'
[250](1, 20, 1)     |                            whitespace:                       '        '
[258](1, 20, 9)     |                            identifier:                       'user_id'
[265](1, 20, 16)    |                            newline:                          '\n'
[266](1, 21, 1)     |                            whitespace:                       '    '
[270](1, 21, 5)     |                            [META] dedent:
[270](1, 21, 5)     |                            end_bracket:                      ')'
[271](1, 21, 6)     |                            [META] dedent:
[271](1, 21, 6)     |                            [META] dedent:
[271](1, 21, 6)     |    newline:                                                  '\n'
[272](1, 22, 1)     |    whitespace:                                               '    '

In [32]:
import json
In [35]:
print(json.dumps(parsed.tree.to_tuple(), indent=4))
                                             ]
                                                            ]
                                                        ]
                                                    ]
                                                ]
                                            ]
                                        ]
                                    ]
                                ],
                                [
                                    "newline",
                                    []
                                ],
                                [
                                    "whitespace",
                                    []
                                ],
                                [
                                    "end_bracket",
                                    []
                                ]
                            ]
                        ],
                        [
                            "newline",
                            []
                        ],
                        [
                            "whitespace",
                            []
                        ],
                        [
                            "select_statement",
                            [
                                [
                                    "select_clause",
                                    [
                                        [
                                            "keyword",
                                            []
                                        ],
                                        [
                                            "whitespace",
                                            []
                                        ],
                                        [
                                            "newline",
                                            []
                                        ],
                                        [
                                            "whitespace",
                                            []
                                        ],
                                        [
                                            "select_clause_element",
                                            [
                                                [
                                                    "wildcard_expression",
                                                    [
                                                        [
                                                            "wildcard_identifier",
                                                            [
                                                                [
                                                                    "star",
                                                                    []
                                                                ]
                                                            ]
                                                        ]
                                                    ]
                                                ]
                                            ]
                                        ]
                                    ]
                                ],
                                [
                                    "whitespace",
                                    []
                                ],
                                [
                                    "newline",
                                    []
                                ],
                                [
                                    "whitespace",
                                    []
                                ],
                                [
                                    "from_clause",
                                    [
                                        [
                                            "keyword",
                                            []
                                        ],
                                        [
                                            "whitespace",
                                            []
                                        ],
                                        [
                                            "newline",
                                            []
                                        ],
                                        [
                                            "whitespace",
                                            []
                                        ],
                                        [
                                            "from_expression",
                                            [
                                                [
                                                    "from_expression_element",
                                                    [
                                                        [
                                                            "table_expression",
                                                            [
                                                                [
                                                                    "table_reference",
                                                                    [
                                                                        [
                                                                            "identifier",
                                                                            []
                                                                        ]
                                                                    ]
                                                                ]
                                                            ]
                                                        ]
                                                    ]
                                                ],
                                                [
                                                    "newline",
                                                    []
                                                ],
                                                [
                                                    "whitespace",
                                                    []
                                                ],
                                                [
                                                    "join_clause",
                                                    [
                                                        [
                                                            "keyword",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "keyword",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "from_expression_element",
                                                            [
                                                                [
                                                                    "table_expression",
                                                                    [
                                                                        [
                                                                            "table_reference",
                                                                            [
                                                                                [
                                                                                    "identifier",
                                                                                    []
                                                                                ]
                                                                            ]
                                                                        ]
                                                                    ]
                                                                ]
                                                            ]
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "keyword",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "start_bracket",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "identifier",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "end_bracket",
                                                            []
                                                        ]
                                                    ]
                                                ],
                                                [
                                                    "newline",
                                                    []
                                                ],
                                                [
                                                    "whitespace",
                                                    []
                                                ],
                                                [
                                                    "join_clause",
                                                    [
                                                        [
                                                            "keyword",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "keyword",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "from_expression_element",
                                                            [
                                                                [
                                                                    "table_expression",
                                                                    [
                                                                        [
                                                                            "table_reference",
                                                                            [
                                                                                [
                                                                                    "identifier",
                                                                                    []
                                                                                ]
                                                                            ]
                                                                        ]
                                                                    ]
                                                                ]
                                                            ]
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "keyword",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "start_bracket",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "identifier",
                                                            []
                                                        ],
                                                        [
                                                            "newline",
                                                            []
                                                        ],
                                                        [
                                                            "whitespace",
                                                            []
                                                        ],
                                                        [
                                                            "end_bracket",
                                                            []
                                                        ]
                                                    ]
                                                ]
                                            ]
                                        ]
                                    ]
                                ]
                            ]
                        ]
                    ]
                ]
            ]
        ],
        [
            "newline",
            []
        ],
        [
            "whitespace",
            []
        ]
    ]
]
In [31]:
parsed.tree.to_tuple()
Out[31]:
('file',
 (('newline', ()),
  ('whitespace', ()),
  ('statement',
   (('with_compound_statement',
     (('keyword', ()),
      ('whitespace', ()),
      ('common_table_expression',
       (('identifier', ()),
        ('whitespace', ()),
        ('keyword', ()),
        ('whitespace', ()),
        ('start_bracket', ()),
        ('newline', ()),
        ('whitespace', ()),
        ('select_statement',
         (('select_clause',
           (('keyword', ()),
            ('whitespace', ()),
            ('select_clause_element',
             (('wildcard_expression',
               (('wildcard_identifier', (('star', ()),)),)),)))),
          ('whitespace', ()),
          ('from_clause',
           (('keyword', ()),
            ('whitespace', ()),
            ('from_expression',
             (('from_expression_element',
               (('table_expression',
                 (('table_reference',
                   (('identifier', ()),
                    ('dot', ()),
                    ('identifier', ()))),)),)),)))))),
        ('newline', ()),
        ('whitespace', ()),
        ('end_bracket', ()))),
      ('comma', ()),
      ('newline', ()),
      ('whitespace', ()),
      ('common_table_expression',
       (('identifier', ()),
        ('whitespace', ()),
        ('keyword', ()),
        ('whitespace', ()),
        ('start_bracket', ()),
        ('newline', ()),
        ('whitespace', ()),
        ('select_statement',
         (('select_clause',
           (('keyword', ()),
            ('whitespace', ()),
            ('select_clause_element',
             (('wildcard_expression',
               (('wildcard_identifier', (('star', ()),)),)),)))),
          ('whitespace', ()),
          ('from_clause',
           (('keyword', ()),
            ('whitespace', ()),
            ('from_expression',
             (('from_expression_element',
               (('table_expression',
                 (('table_reference', (('identifier', ()),)),)),)),)))))),
        ('newline', ()),
        ('whitespace', ()),
        ('end_bracket', ()))),
      ('newline', ()),
      ('whitespace', ()),
      ('select_statement',
       (('select_clause',
         (('keyword', ()),
          ('whitespace', ()),
          ('newline', ()),
          ('whitespace', ()),
          ('select_clause_element',
           (('wildcard_expression',
             (('wildcard_identifier', (('star', ()),)),)),)))),
        ('whitespace', ()),
        ('newline', ()),
        ('whitespace', ()),
        ('from_clause',
         (('keyword', ()),
          ('whitespace', ()),
          ('newline', ()),
          ('whitespace', ()),
          ('from_expression',
           (('from_expression_element',
             (('table_expression',
               (('table_reference', (('identifier', ()),)),)),)),
            ('newline', ()),
            ('whitespace', ()),
            ('join_clause',
             (('keyword', ()),
              ('whitespace', ()),
              ('keyword', ()),
              ('whitespace', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('from_expression_element',
               (('table_expression',
                 (('table_reference', (('identifier', ()),)),)),)),
              ('whitespace', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('keyword', ()),
              ('whitespace', ()),
              ('start_bracket', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('identifier', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('end_bracket', ()))),
            ('newline', ()),
            ('whitespace', ()),
            ('join_clause',
             (('keyword', ()),
              ('whitespace', ()),
              ('keyword', ()),
              ('whitespace', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('from_expression_element',
               (('table_expression',
                 (('table_reference', (('identifier', ()),)),)),)),
              ('whitespace', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('keyword', ()),
              ('whitespace', ()),
              ('start_bracket', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('identifier', ()),
              ('newline', ()),
              ('whitespace', ()),
              ('end_bracket', ()))))))))))),)),
  ('newline', ()),
  ('whitespace', ())))

Extract table names. SQLfluff looks for all table references which are NOT CTE aliases.

In [5]:
parsed.tree.get_table_references()
Out[5]:
{'ban', 'bap', 'bar.bar'}
In [ ]:

Comments