I understand formal languages and grammars at a high level, and
I know the four main grammar types in the Chomsky hierarchy.
I wanted to know how Python's grammar is classified. A quick search turned up some short but incomplete answers.
Is Python a context-free language?
The immediate takeaway from most resources is that Python's grammar is not a context-free grammar (CFG). But that does not answer the question of what it is. When I looked more closely, I realized that it is not a full context-sensitive grammar (CSG) either. So there is still no classification.
At that point, my conclusion was that there must be classes of grammars between CFGs and CSGs, but I had never heard of any.
What I understand is that Python's lexer (which converts strings to tokens) does something a CFG cannot do: it tracks the level of indentation and emits special INDENT and DEDENT tokens. After this transformation, the resulting token stream is context-free and can be parsed into an abstract syntax tree. Therefore, the grammar on tokens is a CFG, but the grammar on characters requires slightly more power than a CFG can provide on its own. I would like to know if there is a classification for this type of grammar. What lies between a CFG and a CSG?
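To make the INDENT/DEDENT behavior concrete, here is a small sketch using the standard library's `tokenize` module. The sample program in `SOURCE` and the helper name `layout_tokens` are my own, made up for illustration; it only prints the layout tokens the tokenizer produces and says nothing about how the tokenizer is implemented internally.

```python
import io
import tokenize

# Hypothetical sample program with two nested blocks.
SOURCE = "if x:\n    y = 1\n    if y:\n        z = 2\nw = 3\n"


def layout_tokens(source):
    """Return the names of the INDENT/DEDENT tokens emitted for `source`, in order."""
    readline = io.StringIO(source).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.INDENT, tokenize.DEDENT)
    ]


print(layout_tokens(SOURCE))
# → ['INDENT', 'INDENT', 'DEDENT', 'DEDENT']
```

Returning to the outer level before `w = 3` produces two DEDENT tokens at once, one per block being closed. The parser never sees the leading whitespace itself, only these bracket-like tokens, which is why the token-level grammar can be context-free.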
After a bit more searching, I came across this table, which is at the end of a Wikipedia article:
Here is a picture of this table:
Cool, so there are known grammar classes between CFGs and CSGs. But I'm not an expert on formal languages, so I do not know how to determine which of these categories Python's grammar belongs to.
Is it "Positive range concatenation", "Indexed", "Thread automaton's grammar", "Linear context-free rewriting system", or "Tree-adjoining"? Or is it none of them? If so, what is its classification?
Note: The full grammar specification of Python 3 can be found here: https://docs.python.org/3/reference/grammar.html