Fix a Python script written by ChatGPT.

12.06.2024 18:14

I wrote a script with ChatGPT that takes text from a Google Sheets document and converts it to JSON. Everything works, but there is one problem: if the text contains code enclosed in [code] [/code] tags, that code does not make it into the JSON. Links are dropped as well. Links are marked with [url] [/url] tags.

Example of the text to process:

"Ученик: Hi, I'm a complete beginner in programming and I choose option 1.
Учитель: Great, let's start with the basics! First we'll cover the fundamentals of HTML and CSS. They are like the bricks and mortar of building web pages. HTML builds the structure, and CSS adds the style. Are you ready to create your first web page?
Ученик: Yes, great, I'm ready to start!
Учитель: Excellent! Let's begin with HTML. Opening tag, closing tag, elements, attributes: do you know what these are?
Ученик: No, I don't know anything.
Учитель: No problem! In HTML, elements form the basis of a web page. Every element is wrapped in an opening [code][/code] and a closing [code][/code] tag. Between these tags there can be text or other tags. For example, [code]

This is a paragraph.

[/code].
Ученик: Got it, and what are attributes?
Учитель: Attributes provide additional information about a tag. They are always located in the opening tag and consist of a key-value pair. Like this: [code][/code]. For example, [code][/code]. Here `src` is the attribute that gives the path to the image file, and `alt` is the alternative description text.
Ученик: Oh, now it's clear! What's next?
Учитель: Now let's try to create a simple HTML page. [url]ihttps://yodo.im[/url] Write the following code: [code] My first page

Hello world!

I'm learning HTML!

[/code] This code creates a simple page with the title "My first page". The body of the page contains a first-level heading and a paragraph of text.
Ученик: Everything works great! What is CSS?
Учитель: CSS, or Cascading Style Sheets, is used for the visual styling of an HTML page. If HTML is the skeleton, CSS is the clothing. Let's add some styles. Create a file `style.css` and add the following code: [code] body { font-family: Arial, sans-serif; background-color: #f4f4f4; margin: 40px; } h1 { color: navy; } p { color: darkgreen; } [/code] Now link this CSS file to the HTML by adding the following line to the `head` section of your HTML file: [code] [/code] This will connect the CSS styles to your web page and change the appearance of the elements.
Ученик: That worked too! Now everything looks more colorful.
Учитель: Wonderful! As you can see, the basics aren't that hard. Any questions about what we've covered? "

Here is the script:

root@Ubuntu-2204-jammy-amd64-base ~/sh/course_dev # cat 2_yodo_json_generator.py
import json
import re
import uuid
import gspread
from oauth2client.service_account import ServiceAccountCredentials
from datetime import datetime
import boto3


def generate_unique_id():
    return str(uuid.uuid4())


def parse_dialog(config):
    # Authorize and get access to Google Sheets
    credentials_file = config['google_sheets']['credentials_file']
    sheet_id = config['google_sheets']['sheet_id']
    scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
    creds = ServiceAccountCredentials.from_json_keyfile_name(credentials_file, scope)
    client = gspread.authorize(creds)
    sheet = client.open_by_key(sheet_id).sheet1
    records = sheet.get_all_records()

    for row in records:
        # Only process rows whose status is "Урок создан" (lesson created)
        if row['Status'] == 'Урок создан':
            dialog_text = row['Текст урока']
            dialog_lines = dialog_text.split('\n')
            blocks = []
            edges = []
            groups = []
            current_group_id = generate_unique_id()
            previous_block_id = None
            group_counter = 1
            inside_code_block = False
            code_block_content = []

            for line in dialog_lines:
                line = line.strip()
                if line.startswith('Учитель:'):
                    # Teacher turn: split into text, [code] markers and [url] spans
                    text = line[len('Учитель:'):].strip()
                    text_blocks = re.split(r'(\[code\]|\[/code\]|\[url\].*?\[/url\])', text)
                    for block in text_blocks:
                        if block == '[code]':
                            inside_code_block = True
                            code_block_content = []
                        elif block == '[/code]':
                            inside_code_block = False
                            code_content = '\n'.join(code_block_content).strip()
                            block_id = generate_unique_id()
                            blocks.append({
                                "id": block_id,
                                "type": "text",
                                "content": {
                                    "richText": [
                                        {
                                            "type": "p",
                                            "children": [
                                                {"text": code_content, "bold": True}
                                            ]
                                        }
                                    ]
                                }
                            })
                            if previous_block_id:
                                edges.append({
                                    "id": generate_unique_id(),
                                    "from": {"blockId": previous_block_id, "groupId": current_group_id},
                                    "to": {"blockId": block_id, "groupId": current_group_id}
                                })
                            previous_block_id = block_id
                        elif inside_code_block:
                            code_block_content.append(block)
                        elif block.startswith('[url]') and block.endswith('[/url]'):
                            url_content = block[len('[url]'): -len('[/url]')].strip()
                            block_id = generate_unique_id()
                            blocks.append({
                                "id": block_id,
                                "type": "text",
                                "content": {
                                    "richText": [
                                        {
                                            "type": "p",
                                            "children": [
                                                {"type": "a", "url": url_content, "children": [{"text": url_content}]}
                                            ]
                                        }
                                    ]
                                }
                            })
                            if previous_block_id:
                                edges.append({
                                    "id": generate_unique_id(),
                                    "from": {"blockId": previous_block_id, "groupId": current_group_id},
                                    "to": {"blockId": block_id, "groupId": current_group_id}
                                })
                            previous_block_id = block_id
                        else:
                            sentences = re.split(r'(\[url\].*?\[/url\])', block)
                            for sentence in sentences:
                                sentence_block_id = generate_unique_id()
                                if sentence.startswith('[url]') and sentence.endswith('[/url]'):
                                    url_content = sentence[len('[url]'): -len('[/url]')].strip()
                                    blocks.append({
                                        "id": sentence_block_id,
                                        "type": "text",
                                        "content": {
                                            "richText": [
                                                {
                                                    "type": "p",
                                                    "children": [
                                                        {"type": "a", "url": url_content, "children": [{"text": url_content}]}
                                                    ]
                                                }
                                            ]
                                        }
                                    })
                                    if previous_block_id:
                                        edges.append({
                                            "id": generate_unique_id(),
                                            "from": {"blockId": previous_block_id, "groupId": current_group_id},
                                            "to": {"blockId": sentence_block_id, "groupId": current_group_id}
                                        })
                                    previous_block_id = sentence_block_id
                                else:
                                    blocks.append({
                                        "id": sentence_block_id,
                                        "type": "text",
                                        "content": {
                                            "richText": [
                                                {
                                                    "type": "p",
                                                    "children": [
                                                        {"text": sentence}
                                                    ]
                                                }
                                            ]
                                        }
                                    })
                                    if previous_block_id:
                                        edges.append({
                                            "id": generate_unique_id(),
                                            "from": {"blockId": previous_block_id, "groupId": current_group_id},
                                            "to": {"blockId": sentence_block_id, "groupId": current_group_id}
                                        })
                                    previous_block_id = sentence_block_id
                elif line.startswith('Ученик:'):
                    # Student turn: '|'-separated answers become a choice input
                    text = line[len('Ученик:'):].strip()
                    choices = text.split('|')
                    choice_items = []
                    for choice in choices:
                        choice_id = generate_unique_id()
                        choice_content = choice.strip()
                        outgoing_edge_id = generate_unique_id()
                        choice_items.append({
                            "id": choice_id,
                            "content": choice_content,
                            "outgoingEdgeId": outgoing_edge_id
                        })
                    block_id = generate_unique_id()
                    blocks.append({
                        "id": block_id,
                        "type": "choice input",
                        "items": choice_items
                    })
                    if previous_block_id:
                        edges.append({
                            "id": generate_unique_id(),
                            "from": {"blockId": previous_block_id, "groupId": current_group_id},
                            "to": {"blockId": block_id, "groupId": current_group_id}
                        })
                    previous_block_id = block_id
                    # A student turn closes the current group and starts a new one
                    groups.append({
                        "id": current_group_id,
                        "title": f'Group #{group_counter}',
                        "graphCoordinates": {"x": group_counter * 150, "y": group_counter * 100},
                        "blocks": blocks  # Pass the block objects, not strings
                    })
                    current_group_id = generate_unique_id()
                    group_counter += 1
                    blocks = []

            if edges:
                json_output = {
                    "version": "6",
                    "id": "clwou1lln006oz3n3yqxpq250",
                    "name": "template_bot",
                    "events": [{"id": generate_unique_id(), "outgoingEdgeId": edges[0]["id"], "graphCoordinates": {"x": 0, "y": 0}, "type": "start"}],
                    "groups": groups,
                    "edges": edges,
                    "variables": [],
                    "theme": {},
                    "selectedThemeTemplateId": None,
                    "settings": {},
                    "createdAt": "",
                    "updatedAt": "",
                    "icon": None,
                    "folderId": None,
                    "publicId": None,
                    "customDomain": None,
                    "workspaceId": None,
                    "resultsTablePreferences": None,
                    "isArchived": False,
                    "isClosed": False,
                    "whatsAppCredentialsId": None,
                    "riskLevel": None
                }
                current_time = datetime.now().strftime('%Y%m%d%H%M')
                output_file = f"{generate_unique_id()}_{current_time}.json"
                with open(output_file, 'w', encoding='utf-8') as f:
                    json.dump(json_output, f, ensure_ascii=False, indent=4)
                print(f'JSON file created: {output_file}')

                # Upload to Wasabi and make public
                wasabi_config = config['wasabi']
                session = boto3.session.Session()
                s3_client = session.client(
                    service_name='s3',
                    aws_access_key_id=wasabi_config['aws_access_key_id'],
                    aws_secret_access_key=wasabi_config['aws_secret_access_key'],
                    endpoint_url=wasabi_config['endpoint_url']
                )
                bucket_name = wasabi_config['bucket_name']
                try:
                    s3_client.upload_file(output_file, bucket_name, output_file)
                    s3_client.put_object_acl(ACL='public-read', Bucket=bucket_name, Key=output_file)
                    public_url = f"{wasabi_config['endpoint_url']}/{bucket_name}/{output_file}"
                    print(f'Public URL: {public_url}')

                    # Update Google Sheet with the Wasabi URL and change status
                    cell = sheet.find(row['Status'])
                    sheet.update_cell(cell.row, 9, public_url)  # Update column I, "JSON URL"
                    sheet.update_cell(cell.row, 7, "Можно публиковать")  # Update the status ("ready to publish")
                except Exception as e:
                    print(f"Failed to upload to Wasabi: {e}")
            else:
                print("No edges found. Skipping JSON file creation.")


if __name__ == "__main__":
    config_path = "/root/sh/course_dev/yodo_config.json"
    with open(config_path, 'r') as f:
        config = json.load(f)
    # parse_dialog() returns nothing (it writes and uploads the files itself),
    # so its result must not be unpacked into two names
    parse_dialog(config)
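One possible direction for a fix, offered as a minimal sketch and not a tested drop-in replacement. The script splits the cell text on '\n' and only handles lines that begin with 'Учитель:' or 'Ученик:', so any line of a multi-line [code] block that does not start with a speaker label is skipped. The sketch below instead works on the whole text at once and uses a pattern compiled with re.DOTALL, so a [code] block may contain line breaks. The names TAG_RE, SPEAKER_RE, split_turns and tokenize are hypothetical and not part of the original script; it also assumes the speaker labels never occur inside a code block. Turning the tokens into blocks and edges is left as in the original script.

```python
import re

# One match per complete [code]...[/code] or [url]...[/url] span.
# re.DOTALL lets "." match newlines, so code may span several lines.
TAG_RE = re.compile(r'\[code\](.*?)\[/code\]|\[url\](.*?)\[/url\]', re.DOTALL)

# Speaker labels, as used by the script (assumed never to occur inside code).
SPEAKER_RE = re.compile(r'(Учитель:|Ученик:)')


def split_turns(dialog_text):
    """Split the dialog into (speaker, text) pairs, regardless of line breaks."""
    parts = SPEAKER_RE.split(dialog_text)
    # parts looks like [preamble, label, text, label, text, ...]
    return [(parts[i].rstrip(':'), parts[i + 1].strip())
            for i in range(1, len(parts) - 1, 2)]


def tokenize(text):
    """Return a list of (kind, value) tokens; kind is 'text', 'code' or 'url'."""
    tokens = []
    pos = 0
    for match in TAG_RE.finditer(text):
        plain = text[pos:match.start()].strip()
        if plain:  # skip empty fragments between adjacent tags
            tokens.append(('text', plain))
        if match.group(1) is not None:
            tokens.append(('code', match.group(1).strip()))
        else:
            tokens.append(('url', match.group(2).strip()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        tokens.append(('text', tail))
    return tokens


if __name__ == "__main__":
    demo = ("Учитель: Напиши код: [code]\n<p>Это абзац.</p>\n[/code] "
            "и открой [url]https://yodo.im[/url]\nУченик: Понял")
    for speaker, text in split_turns(demo):
        print(speaker, tokenize(text))
```

In the original loop, each ('code', ...) or ('url', ...) token would then map to the same block dictionaries the script already builds for code and links, while ('text', ...) tokens map to plain paragraphs.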
