## formal languages - Is Python's grammar in a known category between CFG and CSG?

I understand formal languages and grammars at a high level, and I know the four main grammar types in the Chomsky hierarchy.
I was interested in the classification of Python's grammar. A quick search yielded some fast but incomplete answers.

http://trevorjim.com/python-is-not-context-free/

Is Python a context-free language?

The immediate takeaway from most resources is that Python's grammar is not a context-free grammar (CFG). But that does not answer the question of what it is. When I looked more closely, I realized that it also does not need the full power of a context-sensitive grammar (CSG). But there was still no classification.

At that point, my conclusion was that there must be classes of grammars between CFGs and CSGs, but I had never heard of any.

What I understand is that Python's lexer (which converts the character stream into tokens) does something a CFG cannot do: it tracks the level of indentation and emits special INDENT and DEDENT tokens. After this transformation, the resulting token stream is context-free and can be parsed into an abstract syntax tree. So the grammar over tokens is a CFG, but the grammar over characters needs slightly more power than a CFG can provide on its own. I would like to know if there is a classification for this type of grammar. What lies between a CFG and a CSG?
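As a rough illustration of the indentation tracking described above, here is a minimal Python sketch. It is not CPython's tokenizer: the function name `indentation_tokens` and the `(kind, text)` token shape are made up for this example, and it assumes indentation consists of spaces only. The point is that a stack of indentation widths is enough to turn leading whitespace into INDENT and DEDENT tokens.

```python
def indentation_tokens(source):
    """Turn space-indented text into (kind, text) tokens.

    Hypothetical helper for illustration; kinds are INDENT, DEDENT and LINE.
    """
    stack = [0]          # indentation widths of the currently open blocks
    tokens = []
    for line in source.splitlines():
        if not line.strip():
            continue     # blank lines do not change the indentation level
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # deeper than the current block: open a new one
            stack.append(width)
            tokens.append(("INDENT", ""))
        while width < stack[-1]:
            # shallower: close blocks until we are back at a known level
            stack.pop()
            tokens.append(("DEDENT", ""))
        if width != stack[-1]:
            raise IndentationError("dedent does not match any outer level")
        tokens.append(("LINE", line.strip()))
    while len(stack) > 1:
        # close every block that is still open at end of input
        stack.pop()
        tokens.append(("DEDENT", ""))
    return tokens


if __name__ == "__main__":
    example = "if x:\n    y\n    if z:\n        w\nv\n"
    for token in indentation_tokens(example):
        print(token)
```

Once INDENT and DEDENT exist as ordinary tokens, they behave like opening and closing brackets, which is exactly the kind of nesting a CFG handles.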

After a bit more searching I came across this table, which is at the end of a Wikipedia article:

Here is a picture of this table:

Cool, so there are known grammar classes between CFGs and CSGs. But I'm not an expert on formal languages, so I do not know how to determine which of these categories Python's grammar belongs to.

Is it "Positive range concatenation", "Indexed", "Thread automaton's grammar", "Linear context-free rewriting system", or "Tree-adjoining"? Is it none of them? If so, what is its classification?

Note: The full grammar specification of Python 3 can be found here: https://docs.python.org/3/reference/grammar.html

## Decidability-countability grammar Languages [on hold]

L = a language
M = a TM
Lc = complement of L

1) L = {⟨M⟩ | M is a TM encoded over some alphabet};

``````
L is recursive
``````

2) L = set of all CFLs generated by some CFG;

``````
=> L is recursively enumerable but not recursive; Lc is not recursively enumerable
``````

3) L = set of all CFLs that have an ambiguous grammar;

``````
=> the set of all CFLs (as any CFL can have an ambiguous grammar)
``````

4) L = set of all CFGs

``````
=> L is recursive, as from the form of the production rules we can decide
whether a given grammar is a CFG or not
``````

5) The set of all regular expressions over any alphabet is a CFL, as they contain brackets and these need to be balanced

6) L = {⟨M⟩ | M is a TM such that there is a TM M1 with ⟨M⟩ ≠ ⟨M1⟩ and L(M) = L(M1)}

``````
=> L is decidable (recursive), as for any encoding of a TM M we can make
another TM M1 that accepts the same language but has a different encoding
``````

7) L = set of all CSGs

``````
=> L is decidable
``````

8) L = set of all languages that are CSLs

``````
=> L is recursively enumerable but not recursive
``````

I believe that the above statements are true.
Thank you.

## formal languages - Show that $L := \{(a^{k} b)^{i} \mid i, k \in \mathbb{N}_{+}\}$ is context-sensitive (with a context-sensitive / noncontracting grammar)

I am studying for an upcoming exam and this is an old exam question from two years ago (all exams have been provided by our instructor):

Show that $$L := \{(a^{k} b)^{i} \mid i, k \in \mathbb{N}_{+}\}$$ is context-sensitive.

I could easily construct an LBA for that language.
But since the notation / construction of LBAs was not really explained, I would have to define them in the exam.

Therefore, I assume that this task should be done by creating a context-sensitive grammar.

NOTE: In our lecture, a context-sensitive grammar was defined as a noncontracting grammar.
So every rule x -> y is allowed as long as |x| <= |y|.

So that's allowed:

``````
aAb -> bXaa
``````

My best idea goes like this:

``````
S  -> AB
B  -> bB | b
A  -> CA | C
Cb -> abC
Ca -> aC
``````

To generate `aaabaaabaaab` this way, I do the following:

``````
S
AB
AbB
AbbB
Abbb
CAbbb
CCAbbb
CCCbbb
(let all C-Variables run through the word and leave an 'a' before every 'b')
CCabCbb
CCababCb
CCabababC
...
aaabaaabaaabCCC
``````

But I cannot make the C variables disappear at the end.
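The derivation above can be replayed mechanically. The following Python sketch applies only the two context rules `Cb -> abC` and `Ca -> aC` to a sentential form until neither applies; the function name `rewrite_to_normal_form` and the strategy of always rewriting the leftmost match of the first applicable rule are choices made for this illustration. It reproduces the situation described: every `C` ends up stranded at the right end of the word.

```python
# The two context rules from the grammar above, as (left side, right side).
RULES = [("Cb", "abC"), ("Ca", "aC")]


def rewrite_to_normal_form(word):
    """Apply the rules until none matches; return every intermediate word."""
    steps = [word]
    while True:
        for lhs, rhs in RULES:
            pos = word.find(lhs)          # leftmost occurrence of this rule's left side
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                steps.append(word)
                break                     # restart with the first rule
        else:
            return steps                  # no rule applied: nothing left to rewrite


if __name__ == "__main__":
    steps = rewrite_to_normal_form("CCCbbb")
    print(steps[1:4])   # the first three rewriting steps
    print(steps[-1])    # the final word, with the C's left over at the end
```

Because both rules are noncontracting and neither removes a `C`, any fix has to replace the leftover `C` symbols by terminals rather than delete them.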

## Is it worth it to have Grammar Premium? [on hold]

I just want to know whether it pays to have Grammar Premium as a grammar checker. I want to try these apps, but I want to know if it's really worth having them.

## Context Free – Does the following grammar have a language that is inherently ambiguous?

Grammar is as follows:

$$S \rightarrow aaAb \mid aab \mid A$$

$$A \rightarrow aaAb \mid a \mid \epsilon$$

I think that this language has the following unambiguous grammar.

First, we rewrite the grammar as follows, so this grammar has the same language as the one in question:

$$S \rightarrow aaSb \mid aSb \mid \epsilon$$

This grammar generates $$L = \{a^n b^m : m \le n \le 2m\}$$.

This grammar uses only its two recursive productions, in some order, to derive a string, and then uses the null production to finish the derivation.

So the idea is to fix the order in which the productions are used, so that no use of $$S \rightarrow aaSb$$ comes before a use of the production $$S \rightarrow aSb$$. Following this idea, it is easy to come up with this grammar:

$$S \rightarrow aSb \mid A$$

$$A \rightarrow aaAb \mid \epsilon$$

Another argument is as follows:

We have $$2k_1 + k_2 = n$$ and $$k_1 + k_2 = m$$ for some nonnegative integers $$k_1, k_2$$. (Here $$k_1$$ and $$k_2$$ are the numbers of uses of the productions $$S \rightarrow aaSb$$ and $$S \rightarrow aSb$$, respectively.)

Solving these equations gives $$k_1 = n - m$$ and $$k_2 = 2m - n$$, so the values of $$k_1$$ and $$k_2$$ are unique for each particular string. This means that the production $$S \rightarrow aSb$$ must be used a fixed number of times, so in the last rewritten grammar there is a fixed point in the derivation of a particular string at which to switch with $$S \rightarrow A$$.
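The counting argument can be checked by brute force for small strings. The sketch below counts derivations of $$a^n b^m$$ in the two rewritten grammars; the function names `count_first`, `count_A` and `count_second` are made up for this illustration, and the code only covers the two rewritten grammars, not the grammar originally given in the question.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_first(n, m):
    """Derivations of a^n b^m from S -> aaSb | aSb | epsilon."""
    if m == 0:
        return 1 if n == 0 else 0       # only S -> epsilon is possible
    total = 0
    if n >= 1:
        total += count_first(n - 1, m - 1)   # outermost step S -> aSb
    if n >= 2:
        total += count_first(n - 2, m - 1)   # outermost step S -> aaSb
    return total


def count_A(n, m):
    """A -> aaAb | epsilon derives exactly a^(2k) b^k, each in one way."""
    return 1 if n == 2 * m else 0


@lru_cache(maxsize=None)
def count_second(n, m):
    """Derivations of a^n b^m from S -> aSb | A, A -> aaAb | epsilon."""
    total = count_A(n, m)                    # switch with S -> A right away
    if n >= 1 and m >= 1:
        total += count_second(n - 1, m - 1)  # outermost step S -> aSb
    return total


if __name__ == "__main__":
    for n, m in [(2, 2), (3, 2), (4, 2), (5, 3)]:
        print(n, m, count_first(n, m), count_second(n, m))
```

For every pair with $$m \le n \le 2m$$ the second count is 1, while the first count can be larger (for `aaabb` it is 2, since the two recursive productions can be applied in either order).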

Please help me with this. I would like to know if this language is inherently ambiguous or not.

## Parser – Do precedence and associativity in an LL grammar change the accepted language?

Precedence and associativity certainly affect the AST that parsing produces. But do they change the set of accepted sentences? In other words, can precedence and associativity be "ignored" in the grammar, and the AST fixed up later based on (possibly dynamic) precedence and associativity rules?

I only deal with LL grammars and their common variants.
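For the restricted case of binary infix operators, the idea in the question can be sketched in Python as follows. Everything here is an assumption made for illustration: tokens arrive pre-split as strings, the flat grammar is `operand (operator operand)*`, and `DEFAULT_TABLE`, `accept_flat` and `build_ast` are hypothetical names. Acceptance is decided by the flat LL-friendly grammar alone; the tree is built afterwards from a precedence table (using precedence climbing), so changing the table changes the AST but not the set of accepted inputs.

```python
# Hypothetical operator table: operator -> (precedence, associativity).
DEFAULT_TABLE = {
    "+": (1, "left"),
    "-": (1, "left"),
    "*": (2, "left"),
    "^": (3, "right"),
}


def accept_flat(tokens, operators):
    """Recognize operand (operator operand)* without building a tree."""
    if len(tokens) % 2 == 0:
        raise SyntaxError("expected: operand (operator operand)*")
    for i, tok in enumerate(tokens):
        # operators must sit at odd positions, operands at even ones
        if (tok in operators) != (i % 2 == 1):
            raise SyntaxError(f"unexpected token {tok!r} at position {i}")
    return list(tokens)


def build_ast(flat, table):
    """Turn an accepted flat token list into nested (op, left, right) tuples."""
    pos = 0

    def climb(min_prec):
        nonlocal pos
        left = flat[pos]
        pos += 1
        while pos < len(flat):
            op = flat[pos]
            prec, assoc = table[op]
            if prec < min_prec:
                break
            pos += 1
            # left-associative operators bind tighter to the left operand
            right = climb(prec + 1 if assoc == "left" else prec)
            left = (op, left, right)
        return left

    return climb(0)


if __name__ == "__main__":
    flat = accept_flat("1 + 2 * 3".split(), DEFAULT_TABLE)
    print(build_ast(flat, DEFAULT_TABLE))
```

This only covers the simple infix case; prefix or postfix operators, or precedence that depends on context, would need more than this sketch shows.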

## Compiler – What is the purpose of the augmented grammar in LR parsers?

I was reading about LR parsers in the Dragon Book, where I found these lines:

To construct the canonical LR(0) collection for a grammar, we define an
augmented grammar and two functions, CLOSURE and GOTO.

For example, say we have a grammar like this:

S -> BB
B -> cB | d

Then we define the augmented grammar as

S' -> S

S -> BB

B -> cB | d

The reason given for this in the Dragon Book is:

The purpose of this new starting production is to indicate to the
parser when it should stop parsing and announce acceptance of the
input. That is, acceptance occurs when and only when the parser is about to
reduce by S' -> S.

I cannot understand what this means or how the augmented grammar is useful for parsing. Any help is greatly appreciated. Many thanks. 🙂
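To make the role of `S' -> S` a bit more concrete, here is a small Python sketch of CLOSURE and GOTO on LR(0) items for the example grammar. The representation is an assumption of this sketch: an item is a `(head, body, dot)` tuple, and `GRAMMAR`, `closure` and `goto` are illustrative names rather than code from the book. The initial state is the closure of `S' -> .S`, and moving over `S` from that state yields the single item `S' -> S.`, which is the one place where the parser can announce acceptance.

```python
# Augmented example grammar: S' -> S, S -> BB, B -> cB | d.
GRAMMAR = {
    "S'": [("S",)],
    "S": [("B", "B")],
    "B": [("c", "B"), ("d",)],
}


def closure(items):
    """Add item X -> .gamma for every nonterminal X right after a dot."""
    result = set(items)
    changed = True
    while changed:
        changed = False
        for head, body, dot in list(result):
            if dot < len(body) and body[dot] in GRAMMAR:
                for production in GRAMMAR[body[dot]]:
                    item = (body[dot], production, 0)
                    if item not in result:
                        result.add(item)
                        changed = True
    return result


def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved)


if __name__ == "__main__":
    start_state = closure({("S'", ("S",), 0)})
    print(sorted(start_state))
    print(goto(start_state, "S"))   # the state containing only S' -> S.
```

Without the extra production, `S` could in general also occur on the right side of other rules, so a completed `S` would not by itself tell the parser that the whole input has been recognized.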

## Create machines from a non-regular grammar

I have two grammars:

``````
L → ε | aLcLc
L → ε | aLcLc | LL
``````

These two grammars generate the same language, but the first one is regular, so it generates a regular language that has a finite automaton. The second one is not regular, but it may still generate a regular language.
To prove this, I want to build two different automata: the first should be a correct automaton, and if the second cannot be built, the language is not regular. Are all these statements correct?
If so, can someone help me build these two automata? Thanks a lot!

## How do I know if a grammar is regular or not?

I know that a regular grammar is defined by rules such as
S -> aS
S -> Lambda

But I do not really know how to use this information to check if a grammar is regular or not …

For example, I have a grammar
S -> aSbSb
S -> Lambda

If I compare it to the definition of a regular grammar, this is not a regular grammar, is it? That also means that I can not make a regex out of it.

Could you please give me an example of a regular and a non-regular grammar, which I hope will strengthen my understanding.
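One common definition of a regular grammar is a right-linear one: every production has the form `A -> wB` or `A -> w`, where `w` is a (possibly empty) string of terminals and `B` is a single nonterminal. Under that assumption, the check is purely syntactic and can be sketched in Python. The function name `is_right_linear` and the representation (a production is a `(head, body)` pair, with an empty body standing for Lambda) are choices made for this illustration.

```python
def is_right_linear(productions, nonterminals):
    """Check that every production is A -> wB or A -> w (w terminals only)."""
    for head, body in productions:
        if head not in nonterminals:
            return False            # left side must be a single nonterminal
        # a nonterminal may only appear as the very last symbol of the body
        if any(symbol in nonterminals for symbol in body[:-1]):
            return False
    return True


if __name__ == "__main__":
    nonterminals = {"S"}
    first = [("S", ["a", "S"]), ("S", [])]                       # S -> aS | Lambda
    second = [("S", ["a", "S", "b", "S", "b"]), ("S", [])]       # S -> aSbSb | Lambda
    print(is_right_linear(first, nonterminals))
    print(is_right_linear(second, nonterminals))
```

Note that this test is about the grammar, not the language: a grammar that fails the check can still generate a regular language, so failing it does not by itself rule out a regex.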

## regular expressions – union and difference of the languages produced by the grammar

So I have two languages $$L = \{w \in \{a, b\}^{\ast} \mid w \text{ contains an odd number of a's}\}$$ and $$L^{\prime} = \{w \in \{a, b\}^{\ast} \mid w \text{ contains at least two a's}\}$$.

What are $$L \cup L^{\prime}$$ and $$L \setminus L^{\prime}$$?

In my view, $$L \cup L^{\prime} = \{a, b\}^{\ast} \setminus \{b\}^{\ast}$$ and $$L \setminus L^{\prime} = L \setminus \{\epsilon\}$$, but I am not sure.

Am I right? Thanks for your help.
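Claims like these can be sanity-checked by brute force over all short words. The Python sketch below enumerates every word over `{a, b}` up to a small length (the bound `MAX_LEN = 6` and all variable names are arbitrary choices for this illustration) and compares the actual union and difference with the two proposed descriptions. A check on short words cannot prove an equality, but it can expose counterexamples.

```python
from itertools import product

MAX_LEN = 6  # arbitrary bound for the brute-force check


def words(max_len, alphabet="ab"):
    """Yield every word over the alphabet with length 0..max_len."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


universe = set(words(MAX_LEN))
L = {w for w in universe if w.count("a") % 2 == 1}        # odd number of a's
L_prime = {w for w in universe if w.count("a") >= 2}      # at least two a's

union = L | L_prime
difference = L - L_prime

# The two descriptions proposed in the question, restricted to short words.
claimed_union = universe - {w for w in universe if "a" not in w}
claimed_difference = L - {""}

if __name__ == "__main__":
    print(union == claimed_union)
    mismatches = sorted(difference ^ claimed_difference, key=lambda w: (len(w), w))
    print(mismatches[:3])   # shortest words on which the second claim fails
```

On these short words the proposed description of the union holds, while the one for the difference does not: a word such as `aaa` is in both $$L$$ and $$L^{\prime}$$, so it is not in $$L \setminus L^{\prime}$$, which contains only words with exactly one `a`.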