Revision 29 - History - Project architecture - OpenData vlastní téma (KIV) - Tři Mušketýři - Redmine

Revision 28 (Alex Konig, 2021-04-30 10:24) → Revision 29/56 (Alex Konig, 2021-04-30 10:29)

h1. Project architecture 

 The application consists of two parts:

 * Server 
 * Client application 


 h1. Server architecture 

 The following text specifies the architecture of the server application and the communication between its parts. 

 The simple diagram in the Architecture overview section shows the classes that are relevant to more than one main package of the server, and the requests that take place between the main packages. 
 The main packages of the server are:  
 * DataLoader 
 * Parser 
 * Model 
 * Connection 
 * WeatherPredictionParser 
 * User 

 Main requests that take place within this system are: 
 * Administrator asks User to download new data or to retrain the model 
 * Connection asks for a prediction for an input from the user, or requests retraining of the model 
 * Model asks Parser for information acquired from data 
 * Parser asks DataLoader for the path to the folder containing data 
 * Model asks WeatherPredictionParser for the current weather prediction for today, tomorrow or the day after tomorrow, if that data is required to fulfil a request from the client 

 h3. Configuration 

 When launching the server application a configuration file must be passed as a command line argument. 

 h4. Configuration file 

 The configuration file must contain the following lines: 

 * site for opendata 
 * naming_convention 
 * data_root_dir 
 * port 
 * site for weather prediction (optional) 

 These lines must be followed by lines containing the desired values.  
 The default settings of the config file are as follows: 

 h5. site for opendata 

 The site configuration specifies the website where the UWB open data can be downloaded. 
 It is by default set to *http://openstore.zcu.cz/* 

 h5. naming_convention 

 The naming convention specifies how the archives available for download are named. 
 It is by default set to *OD_ZCU_{type}_{month}_{year}_{format}.zip* 
 The variables in this string must keep their names; none of them may be omitted and no others may be added. Each variable must be enclosed in {} braces. 
 The { and } characters are treated as special characters and cannot be used as part of the name. 
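
 The rules above can be illustrated with a minimal sketch that expands the naming convention into an archive name. The class name NamingConvention, the method Expand and the sample values are hypothetical and are not taken from the server code; the real values for {type} and {format} come from the enums in the server architecture diagram, and whether the month is zero-padded is an assumption.

 <pre><code class="csharp">
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the server code.
public static class NamingConvention
{
    // The four variables named in the default convention.
    private static readonly string[] Variables = { "type", "month", "year", "format" };

    // Replaces every {variable} placeholder with the supplied value.
    public static string Expand(string convention, IDictionary<string, string> values)
    {
        string result = convention;
        foreach (string name in Variables)
        {
            string placeholder = "{" + name + "}";
            // No variable may be left out of the convention.
            if (!result.Contains(placeholder))
                throw new ArgumentException("Missing variable " + placeholder);
            result = result.Replace(placeholder, values[name]);
        }
        // A leftover brace means an unknown variable or a stray special character.
        if (result.Contains("{") || result.Contains("}"))
            throw new ArgumentException("Unexpected '{' or '}' in naming convention");
        return result;
    }
}
 </code></pre>

 With the default convention and the assumed values type = JIS, month = 04, year = 2021 and format = CSV, Expand returns OD_ZCU_JIS_04_2021_CSV.zip.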


 h5. data_root_dir 

 The data root directory specifies where the downloaded data will be stored. In this root directory subdirectories are created for individual data types. 
 It is by default set to *.\data\auto* (relative to the config file's path) 

 h5. port 

 The port specifies the port number on which the server listens for client connections. 
 It is by default set to *10000* 

 h5. site for weather prediction 

 The configuration file can also contain a link to the site from which the weather prediction is downloaded. If no site is specified, the default site http://wttr.in/Pilsen?format=j1 is used. The link must lead to a page with a JSON file in the format specified on page [[Data file structure]]. 

 h2. Architecture overview 

 !basic_architecture_v6.png! 

 h2. DataLoader architecture 

 The DataLoader package takes care of downloading data from the specified website, saving them to a specified directory and providing them to the Parser. 

 h3. Date class 

 The Date class represents a date given by a month and a year. It overloads the comparison operators >, <, >=, <=, == and !=. Two dates are equal if both month and year match. A date is greater than another date if it comes after it, and vice versa. 
 The class also provides a method that returns a new date with the month increased by one; if the original month was 12, the month is set to 1 and the year is increased by one. 

 Instances of this class are passed as arguments to various methods of the DataDownloader class (see the server architecture diagram).  
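
 The behaviour described above could look roughly like the following sketch. It is an illustration only, not the project's actual Date class: the constructor argument order, the property names and the method name IncreaseMonthByOne are assumptions.

 <pre><code class="csharp">
using System;

// Illustrative sketch of a month/year date with overloaded comparison operators.
public sealed class Date : IComparable<Date>, IEquatable<Date>
{
    public int Month { get; }
    public int Year { get; }

    public Date(int month, int year)
    {
        Month = month;
        Year = year;
    }

    // Orders by year first, then by month.
    public int CompareTo(Date other)
    {
        if (other is null) return 1;
        int byYear = Year.CompareTo(other.Year);
        return byYear != 0 ? byYear : Month.CompareTo(other.Month);
    }

    // Two dates are equal when both month and year match.
    public bool Equals(Date other) => !(other is null) && CompareTo(other) == 0;
    public override bool Equals(object obj) => Equals(obj as Date);
    public override int GetHashCode() => Year * 12 + Month;

    public static bool operator ==(Date a, Date b) => a is null ? b is null : a.Equals(b);
    public static bool operator !=(Date a, Date b) => !(a == b);
    public static bool operator <(Date a, Date b) => a.CompareTo(b) < 0;
    public static bool operator >(Date a, Date b) => a.CompareTo(b) > 0;
    public static bool operator <=(Date a, Date b) => a.CompareTo(b) <= 0;
    public static bool operator >=(Date a, Date b) => a.CompareTo(b) >= 0;

    // Returns a new date one month later; December rolls over to January of the next year.
    public Date IncreaseMonthByOne() =>
        Month == 12 ? new Date(1, Year + 1) : new Date(Month + 1, Year);
}
 </code></pre>

 In this sketch the class is immutable, so increasing the month never changes a date that is already in use elsewhere, for example as the start of a range.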

 h3. DataDownloader class 

 todo: rename to DataLoader 

 This class takes care of downloading data, storing it and providing it to the Parser. 
 The constructor of this class takes 3 arguments:  
 <pre><code class="java"> 
   public DataDownloader(string rootDataDir, string website, string namingConvention) 
 </code></pre> 

 The values for these arguments are found in the configuration file. 

 It provides the following public properties: 
 <pre><code class="java"> 
   public string RootDataDirectory { get; } 
   public Dictionary<DataType, string> DataSubDirectories { get; } 
   public bool OverwriteExisting { get; set; } 
 </code></pre> 


 h4. Data download 

 Data is downloaded using the method  

 <pre><code class="java"> 
   public List<string> DownloadData(DataType type, DataFormat format, Date startDate, Date endDate) 
 </code></pre> 

 The data type and format must be specified (see the enums in the server architecture diagram for supported types and formats), together with a date range given by the startDate and endDate arguments. The method then attempts to download all files that fall within this date range and returns a list of full paths to all successfully saved data files. 
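
 One way to picture the download loop is as one download attempt per month in the range. The sketch below only enumerates the months; it is not the actual DownloadData implementation. The class DownloadRange and the method MonthsBetween are hypothetical, the dates are simplified to (month, year) tuples, and treating endDate as inclusive is an assumption.

 <pre><code class="csharp">
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the server code.
public static class DownloadRange
{
    // Yields every (month, year) pair from start to end, both inclusive.
    public static IEnumerable<(int Month, int Year)> MonthsBetween(
        (int Month, int Year) start, (int Month, int Year) end)
    {
        int month = start.Month, year = start.Year;
        while (year < end.Year || (year == end.Year && month <= end.Month))
        {
            yield return (month, year);
            // Step to the next month, rolling over into the next year after December.
            if (month == 12) { month = 1; year++; }
            else { month++; }
        }
    }
}
 </code></pre>

 For the range 11/2020 to 2/2021 this yields 11/2020, 12/2020, 1/2021 and 2/2021. A range whose start lies after its end yields nothing.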


 h4. Data retrieval 

 Saved data is retrieved using the method 

 <pre><code class="java"> 
   public List<string> GetData(string subDirectory, Date startDate, Date endDate) 
 </code></pre> 

 The first argument specifies which subdirectory should be searched. The arguments startDate and endDate specify the time range. 
 This method returns a list of full paths to all data files corresponding to the specified date range. If not enough files were found (meaning some months for the specified range are missing because they were not downloaded) and a file with month 0 exists in the directory for the year in question, then this file is returned as well. 
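
 The fallback to the month 0 file can be sketched as follows. This is an illustration under assumptions, not the actual GetData code: the class SavedDataLookup and the method Select are hypothetical, and the files already on disk are modelled as a dictionary from (month, year) to full path instead of being read from the subdirectory.

 <pre><code class="csharp">
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the server code.
public static class SavedDataLookup
{
    // 'saved' maps (month, year) to the full path of a file already on disk;
    // month 0 stands for the "file with month 0" of the given year.
    public static List<string> Select(
        IDictionary<(int Month, int Year), string> saved,
        (int Month, int Year) start, (int Month, int Year) end)
    {
        var paths = new List<string>();
        var yearsWithGaps = new SortedSet<int>();
        int month = start.Month, year = start.Year;
        while (year < end.Year || (year == end.Year && month <= end.Month))
        {
            // Collect the monthly file, or remember that this year has a missing month.
            if (saved.TryGetValue((month, year), out string path)) paths.Add(path);
            else yearsWithGaps.Add(year);
            if (month == 12) { month = 1; year++; } else { month++; }
        }
        // For every year with a missing month, add the month 0 file if one exists.
        foreach (int gapYear in yearsWithGaps)
        {
            if (saved.TryGetValue((0, gapYear), out string yearly)) paths.Add(yearly);
        }
        return paths;
    }
}
 </code></pre>

 In this sketch the month 0 file is appended after the monthly files and is left out when every month in the range was found.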

 h2. User architecture 

 The User package contains a class with a method that accepts commands from the user (administrator). This method runs in a separate thread from the rest of the server program and waits for commands entered on the command line.  

 h2. Connection architecture 

 The Connection package takes care of receiving connection requests from clients.  
 The server is built with an asynchronous socket, so execution of the server application is not suspended while it waits for a connection from a client. 
 See https://docs.microsoft.com/en-us/dotnet/framework/network-programming/asynchronous-server-socket-example for details of the classes used.  
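
 The non-blocking accept can be sketched with the Begin/End pattern of System.Net.Sockets.Socket, which is the pattern the linked example uses. This is a minimal illustration, not the server's Connection code: the class AsyncListener and its members are hypothetical, and it binds to the loopback address only to keep the sketch local.

 <pre><code class="csharp">
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical sketch of an asynchronous listener, not the project's Connection class.
public sealed class AsyncListener
{
    private readonly Socket _listener =
        new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

    // Raised on a thread-pool thread each time a client connects.
    public event Action<Socket> ClientConnected;

    // The port actually bound; useful when Start is called with port 0.
    public int Port => ((IPEndPoint)_listener.LocalEndPoint).Port;

    public void Start(int port)
    {
        _listener.Bind(new IPEndPoint(IPAddress.Loopback, port));
        _listener.Listen(100);
        // BeginAccept returns immediately, so the caller is not suspended.
        _listener.BeginAccept(OnAccept, null);
    }

    private void OnAccept(IAsyncResult result)
    {
        Socket client;
        try
        {
            client = _listener.EndAccept(result);
            // Keep accepting further clients.
            _listener.BeginAccept(OnAccept, null);
        }
        catch (ObjectDisposedException) { return; } // listener was closed
        catch (SocketException) { return; }         // accept was aborted
        ClientConnected?.Invoke(client);
    }

    public void Stop() => _listener.Close();
}
 </code></pre>

 Calling Start(10000) would listen on the default port from the configuration file; the calling thread continues immediately and clients are handled in the ClientConnected handler.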

 Communication with the client follows the rules specified on page [[Server-client communication]]. 

 h2. Model architecture 

 Several Naive Bayes classifiers; the basis on which the input information is combined is still to be determined. 

 h2. Interface model-parser 

 Model can request parsing of new data files by calling the method Parse() of the class DataParser. Model specifies the time period it is interested in (dd:mm:yyyy-dd:mm:yyyy) and whether data from one day should be aggregated into a single piece of information or, if not, the length (in hours) of the intervals into which each day is divided. After parsing, the information is stored in the attributes WeatherList and AttendanceList of the class DataParser. 

 WeatherList contains weather information obtained from data files, and AttendanceList contains the information about the amount of activity (jis activations and webAuth data) that took place. 

 For example, if the request is made with wholeDay set to false and intervalLength set to 3, days are divided into 3-hour intervals. For each day, entries are created for the following times: 

 * 7-10h 
 * 10-13h 
 * 13-16h 
 * 16-19h 
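
 The division into intervals can be sketched as below. The class DayIntervals and the method Split are hypothetical. The start hour 7 and end hour 19 are inferred from the example list above and are not confirmed by the specification; how the real parser treats a leftover interval shorter than intervalLength is unknown, and this sketch simply drops it.

 <pre><code class="csharp">
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Parser code.
public static class DayIntervals
{
    // Splits the span from startHour to endHour into consecutive intervals
    // of intervalLength hours each.
    public static List<(int From, int To)> Split(int startHour, int endHour, int intervalLength)
    {
        if (intervalLength <= 0)
            throw new ArgumentOutOfRangeException(nameof(intervalLength));
        var intervals = new List<(int From, int To)>();
        for (int from = startHour; from + intervalLength <= endHour; from += intervalLength)
            intervals.Add((from, from + intervalLength));
        return intervals;
    }
}
 </code></pre>

 Split(7, 19, 3) produces the four entries listed above: 7-10, 10-13, 13-16 and 16-19.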

 h2. Parser architecture 

 The Parser part of the server is responsible for reading and parsing data from separate files and aggregating the data in the way requested by the model. It expects input in the format specified in [[Data file structure]] and outputs a relatively universal set of information. 

 However, both output and input depend on the specific tags used in the data. If only these tags changed, the only class needing changes would be TagInfo. If the input data file format changed, the class CsvDataLoader would need to change. If the data input were something other than jis and webauth activity, the packages InputInfo and Parsers would need to change. The output classes are written to be general (general weather information and activity information); however, if there were big changes in the input or output specification (for instance a newly added weather input such as fog), it would be better to rewrite or accordingly modify this whole module, as long as the interface of DataParser is respected. There is a risk that some changes might also interfere with Model, because the model depends to a degree on the information derived from the data: it extracts symptoms from this information, and we cannot predict which extra symptoms could be added. 

 Interesting classes (some of which were already mentioned above) to note are: 

 h3. CsvLoader 

 Class responsible for loading input data files into memory. Can be swapped for a class processing different types of files as long as it provides the same methods. 

 h3. DataParser 

 Class responsible for parsing the input data into information. Can be swapped for a class processing different input files as long as it provides the same methods. 

 h3. TagInfo 

 Tags specified in this class correspond to the ones used in [[Data sources]]. 

 !parser_architecture_v2.png! 


 h2. WeatherPredictionParser architecture 

 This part of the server application is responsible for downloading new information about current weather predictions. It is designed to work with the data source http://wttr.in/?format=j1 

 A specific place can be set through the configuration file described in the Configuration section above. 


 h2. Interface Parser-DataLoader 

 Parser requests the path to the folder with data files from DataLoader. It can also ask DataLoader to filter the data file names and return only those from a specified time period (mm:yyyy-mm:yyyy). 



 h1. Client application architecture 

 The client is a Unity application; a UML diagram is not applicable (MonoBehaviour).