Revision 6 (Petr Hlaváč, 2020-05-27 09:02) → Revision 7/10 (Petr Hlaváč, 2020-05-27 09:03)
h1. DatasetProcessing

This folder contains the processor implementations for the individual datasets. Processors are imported dynamically, so the naming convention *"dataset-name"_processor.py* must be followed.

Fill the prepared date_dict as follows:

* date_dict key -> date in the format YYYY-mm-dd-hh
* date_dict value -> data_dict (another dictionary)
* data_dict key -> device name
* data_dict value -> CSVUtils.CSVDataline

*Data validity is checked when a CSVUtils.CSVDataline is created. When the data is exported to CSV, the export then checks whether the objects really are instances of the CSVUtils.CSVDataline class!*

After implementing the method, change *return None* to *return date_dict*.

h2. Generated processor

<pre>
from Utilities.CSV import csv_data_line


def process_file(filename):
    """
    Method that takes the path to a crawled file and outputs a date dictionary:
    The date dictionary is a dictionary where keys are dates in format ddmmYYYYhh (0804201815)
    and values are dictionaries where keys are devices (specified in the configuration file)
    and values are csv_data_line.CSVDataLine objects with device, date and occurrence

    Args:
        filename: name of the processed file

    Returns:
        None if not implemented
        date_dict when implemented
    """
    date_dict = dict()

    #with open(filename, "r") as file:
    print("You must implements process_file method first!")
    return None
</pre>

h2. Sample processor implementation

<pre>
from Utilities.CSV import csv_data_line
from Utilities import date_formating


def process_file(filename):
    """
    Method that takes the path to a crawled file and outputs a date dictionary:
    The date dictionary is a dictionary where keys are dates in format ddmmYYYYhh (0804201815)
    and values are dictionaries where keys are devices (specified in the configuration file)
    and values are csv_data_line.CSVDataLine objects with device, date and occurrence

    Args:
        filename: name of the processed file

    Returns:
        None if not implemented
        date_dict when implemented
    """
    date_dict = dict()

    with open(filename, "r") as file:
        for line in file:
            # Fields are separated by semicolons
            array = line.split(";")

            # [1:-1] drops the first and last character of each field
            date = date_formating.date_time_formatter(array[0][1:-1])
            name = array[1][1:-1]

            if date not in date_dict:
                date_dict[date] = dict()

            # Count repeated occurrences of a device within the same date key
            if name in date_dict[date]:
                date_dict[date][name].occurrence += 1
            else:
                date_dict[date][name] = csv_data_line.CSVDataLine(name, date, 1)

    return date_dict
</pre>
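The nested structure that a processor has to return (date key -> device name -> data line) and the kind of type check described for the CSV export can be tried out without the project modules. The following is only a minimal sketch under stated assumptions: @DataLine@, @build_date_dict@ and @is_valid_date_dict@ are hypothetical names made up for illustration, @DataLine@ merely stands in for the project's data-line class, and the sample date keys and device names are invented.

```python
from dataclasses import dataclass


@dataclass
class DataLine:
    """Hypothetical stand-in for the project's data-line class."""
    name: str
    date: str
    occurrence: int


def build_date_dict(records):
    """Group (date, device) pairs into date -> device -> DataLine."""
    date_dict = {}
    for date, name in records:
        # Create the inner dictionary for this date key on first use
        devices = date_dict.setdefault(date, {})
        if name in devices:
            devices[name].occurrence += 1
        else:
            devices[name] = DataLine(name, date, 1)
    return date_dict


def is_valid_date_dict(date_dict):
    """Return True if every innermost value is a DataLine instance."""
    return all(
        isinstance(line, DataLine)
        for devices in date_dict.values()
        for line in devices.values()
    )


# Invented sample input: (date key, device name) pairs
records = [
    ("2018-04-08-15", "device-a"),
    ("2018-04-08-15", "device-a"),
    ("2018-04-08-15", "device-b"),
    ("2018-04-08-16", "device-a"),
]
date_dict = build_date_dict(records)
```

Storing a plain value such as an integer instead of a data-line object makes @is_valid_date_dict@ return False, which mirrors why a processor should only put data-line objects into the inner dictionaries.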