By using this site, you agree to have cookies stored on your device, strictly for functional purposes, such as storing your session and preferences.

Dismiss

Fix markdown list parsing

roundabout,
created on Sunday, 31 March 2024, 12:20:34 (1711887634), received on Wednesday, 31 July 2024, 06:54:43 (1722408883)
Author identity: vlad <vlad.muntoiu@gmail.com>

2a352ecc3f7833e397a33beb3117a63ab22e742d

markdown.py

@@ -144,9 +144,9 @@ class OrderedList(Element):

                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                    return "ol"
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        class ListItem(Paragraph):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        class ListItem(Element):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                def __init__(self, content):
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                super().__init__("")
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                super().__init__()
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                    self.content = tokenise(content)
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                def __repr__(self):
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        

@@ -323,14 +323,31 @@ def tokenise(source):

                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                        content = []
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                        while i < len(lines) and ((lines[i].startswith("*") or lines[i].startswith("+") or lines[i].startswith("-")) and lines[i][1:].startswith(" ")):
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        inner_content = lines[i][2:].strip() + "\n"
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        inner_content = lines[i][2:].strip() + "\n"      # discard marker and space
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                            i += 1
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        while i < len(lines) and lines[i].startswith("  "):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                            inner_content += lines[i][2:] + "\n"
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        while i < len(lines) and lines[i].strip() and not ((lines[i].startswith("*") or lines[i].startswith("+") or lines[i].startswith("-")) and lines[i][1:].startswith(" ")):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                            inner_content += lines[i] + "\n"
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                                i += 1
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                            content.append(ListItem(inner_content))
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                        current_block = UnorderedList(content)
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                elif re.match(r"^\d+\.", line):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                    if not isinstance(current_block, UnorderedList):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        tokens.append(current_block)
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                    content = []
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                    while i < len(lines) and re.match(r"^\d+\.", line) and len(lines[i].split(".", 1)) > 1:
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        inner_content = lines[i].split(".", 1)[1] + "\n"      # discard number and period
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        i += 1
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        while i < len(lines) and lines[i].strip() and not re.match(r"^\d+\.", line):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                            inner_content += lines[i] + "\n"
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                            i += 1
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                        content.append(ListItem(inner_content))
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                    current_block = OrderedList(content)
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                    elif line.startswith("#") and leading(line.lstrip("#"), " "):
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                        tokens.append(current_block)
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        

@@ -365,7 +382,7 @@ def tokenise(source):

                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                            i += 1    # prevent a new block from beginning with the closing fence
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                        current_block = CodeBlock(content, language=language)
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                elif only_chars(lines[i+1].strip(), "=") or only_chars(lines[i+1].strip(), "-"):
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                                elif i < len(lines) - 1 and (only_chars(lines[i+1].strip(), "=") or only_chars(lines[i+1].strip(), "-")) and lines[i+1].strip():
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                        tokens.append(current_block)
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                        content = line.strip()
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        

@@ -444,10 +461,19 @@ amet.

                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            2. Test
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            3. Test
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        * Test
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        * Test
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            * Lorem
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                              ipsum
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                              * Test
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                              * Test
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                          
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        > Hello
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        > World
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        > * Test
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        > * Test
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                        > * Test
                                        
                                        
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                            """
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                )
                                        
                                        
                                            
                                            
                                            
                                            
                                        
                                    
                                
                                
                                
                            
                                
                                    
                                        
                                            
                                                # for i in ast: