Revision 51/56 · Alex Konig, 2021-06-11 09:47


Project architecture

The application consists of two parts:

  • Server
  • Client application

Server architecture

This section describes the architecture of the server application and the communication between its parts.

Architecture overview

The simple visualisation below shows the classes that are relevant to more than one main package of the server and the requests that take place between the main packages.
The main packages of the server are the following:
  • DataLoader
  • Parser
  • Model
  • Connection
  • WeatherPredictionParser
  • UserCommunication
The main requests that take place within this system are:
  • The administrator asks UserCommunication to download new data or to retrain the model
  • Connection asks for a prediction for input from a user
  • Model asks Parser for information acquired from data
  • Parser asks DataLoader for the path to the folder containing data
  • Model asks WeatherPredictionParser for the current weather prediction for today/tomorrow/the day after tomorrow if that data is required to fulfil a request from a client

Configuration

When launching the server application, a configuration file must be passed as a command-line argument.

Configuration file

The configuration file must contain the following lines:

  • site for opendata
  • naming_convention
  • data_root_dir
  • port
  • site for weather prediction (optional)

These lines must be followed by lines containing the desired values.
The default settings of the configuration file are as follows:

site for opendata

The site configuration specifies the website where the UWB open data can be downloaded.
It is by default set to http://openstore.zcu.cz/

naming_convention

The naming convention specifies how the archives available for download are named.
It is by default set to OD_ZCU_{type}_{month}_{year}_{format}.zip
The variables in this string must keep their names; none of them may be removed and no others may be added. They must be enclosed in {} brackets. The { and } characters are treated as special characters and cannot be used as part of the name.
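
The rule above can be illustrated with a minimal sketch that fills the placeholders of the naming convention. The class NamingConvention, the method Fill and the sample values (JIS, 03, 2020, CSV) are hypothetical and are not taken from the project; they only show how {}-enclosed variables might be substituted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the documented server code.
public static class NamingConvention
{
    // Replaces each {name} placeholder in the template with its value.
    // Throws if the template does not contain a placeholder the caller supplies,
    // mirroring the rule that no variable may be left out of the convention.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        string result = template;
        foreach (KeyValuePair<string, string> pair in values)
        {
            string placeholder = "{" + pair.Key + "}";
            if (!template.Contains(placeholder))
                throw new ArgumentException("Template lacks " + placeholder);
            result = result.Replace(placeholder, pair.Value);
        }
        return result;
    }
}
```

With the default convention OD_ZCU_{type}_{month}_{year}_{format}.zip and the assumed values type=JIS, month=03, year=2020, format=CSV, Fill returns OD_ZCU_JIS_03_2020_CSV.zip.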

data_root_dir

The data root directory specifies where the downloaded data will be stored. In this root directory subdirectories are created for individual data types.
It is by default set to .\data\auto (relative to the config file's path)

port

The port specifies the port number on which the server listens for client connections.
It is by default set to 10000

site for weather prediction

Furthermore, the configuration file can contain a link to the site from which the weather prediction is downloaded. If no site is specified, the default site http://wttr.in/Pilsen?format=j1 is used. The link must lead to a JSON file whose format satisfies the one specified on the page Data file structure.

DataLoader architecture

The DataLoader package takes care of downloading data from the specified website, saving it to a specified directory and providing it to the Parser.

Date class

The Date class represents a date given by a month and a year. It overloads the comparison operators >, <, >=, <=, == and !=. Two dates are equal if both month and year match. A date is greater than another date if it falls after it, and vice versa.
This class also provides a method for increasing the month by one. The method returns a new date with the month increased by one; if the original month was 12, the month is set to 1 and the year is increased by one.

Instances of this class are passed as arguments to various methods of the DataDownloader class (see the server architecture diagram).
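
A minimal sketch of such a month/year type is given below. It follows the behaviour described above, but the member names (Month, Year, IncreaseMonthByOne) and the choice of a struct are assumptions and may differ from the actual Date class.

```csharp
using System;

// Hypothetical sketch of a month/year date; member names are assumptions.
public readonly struct Date : IComparable<Date>, IEquatable<Date>
{
    public int Month { get; }
    public int Year { get; }

    public Date(int month, int year)
    {
        Month = month;
        Year = year;
    }

    // Orders by year first, then by month.
    public int CompareTo(Date other) =>
        Year != other.Year ? Year.CompareTo(other.Year) : Month.CompareTo(other.Month);

    public bool Equals(Date other) => CompareTo(other) == 0;
    public override bool Equals(object obj) => obj is Date other && Equals(other);
    public override int GetHashCode() => Year * 12 + Month;

    public static bool operator ==(Date a, Date b) => a.Equals(b);
    public static bool operator !=(Date a, Date b) => !a.Equals(b);
    public static bool operator <(Date a, Date b) => a.CompareTo(b) < 0;
    public static bool operator >(Date a, Date b) => a.CompareTo(b) > 0;
    public static bool operator <=(Date a, Date b) => a.CompareTo(b) <= 0;
    public static bool operator >=(Date a, Date b) => a.CompareTo(b) >= 0;

    // Returns a new date one month later; December rolls over to January of the next year.
    public Date IncreaseMonthByOne() =>
        Month == 12 ? new Date(1, Year + 1) : new Date(Month + 1, Year);
}
```

Returning a new value instead of mutating the existing one matches the description of the month-increasing method and keeps the type safe to pass between methods.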

DataDownloader class

todo: rename to DataLoader

This class takes care of downloading data, storing it and providing it to the Parser.
The constructor of this class takes 3 arguments:

  public DataDownloader(string rootDataDir, string website, string namingConvention)

The values for these arguments are found in the configuration file.

It provides the following public properties:

  public string RootDataDirectory { get; }
  public Dictionary<DataType, string> DataSubDirectories { get; }
  public bool OverwriteExisting { get; set; }

Data download

Data is downloaded using the method

  public List<string> DownloadData(DataType type, DataFormat format, Date startDate, Date endDate)

The data type and format need to be specified (see the enums in the server architecture for supported types and formats). The date range is specified using the startDate and endDate arguments. The method then attempts to download all files falling within this date range. It returns a list of full paths to all successfully saved data files.
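
How a date range expands into individual months can be sketched as follows. This is only an illustration: the class MonthRange is hypothetical, dates are modelled as plain (month, year) tuples rather than the Date class, and treating both ends of the range as inclusive is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the documented server code.
public static class MonthRange
{
    // Yields every (month, year) pair from start to end, both inclusive (assumed).
    public static IEnumerable<(int Month, int Year)> Enumerate(
        (int Month, int Year) start, (int Month, int Year) end)
    {
        int month = start.Month, year = start.Year;
        while (year < end.Year || (year == end.Year && month <= end.Month))
        {
            yield return (month, year);
            // Advance by one month, rolling over from December to January.
            if (month == 12) { month = 1; year++; }
            else { month++; }
        }
    }
}
```

For the range 11/2020 to 2/2021 this yields four months: 11/2020, 12/2020, 1/2021 and 2/2021. A download loop would attempt one archive per yielded month.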

Data retrieving

Saved data is retrieved using the method

  public List<string> GetData(string subDirectory, Date startDate, Date endDate)

The first argument specifies which subdirectory should be searched. The arguments startDate and endDate specify the time range.
This method returns a list of full paths to all data files corresponding to the specified date range. If not enough files were found (meaning some months for the specified range are missing because they were not downloaded) and a file with month 0 exists in the directory for the year in question, then this file is returned as well.
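
The fallback to the month-0 file can be sketched as below. The class FileSelection and its method are hypothetical, the available files are modelled as a dictionary keyed by (month, year), and the order of the returned paths is an assumption; the real GetData works on file names in a directory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the documented server code.
public static class FileSelection
{
    // 'available' maps (month, year) to a file path; month 0 marks the
    // fallback file of a year, as described in the text above.
    public static List<string> Select(
        IDictionary<(int Month, int Year), string> available,
        (int Month, int Year) start, (int Month, int Year) end)
    {
        var result = new List<string>();
        var yearsWithGaps = new SortedSet<int>();
        int month = start.Month, year = start.Year;
        while (year < end.Year || (year == end.Year && month <= end.Month))
        {
            if (available.TryGetValue((month, year), out string path))
                result.Add(path);
            else
                yearsWithGaps.Add(year); // a month of this year is missing
            if (month == 12) { month = 1; year++; } else { month++; }
        }
        // For every year with a missing month, add its month-0 file if one exists.
        foreach (int gapYear in yearsWithGaps)
        {
            if (available.TryGetValue((0, gapYear), out string fallback))
                result.Add(fallback);
        }
        return result;
    }
}
```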

UserCommunication architecture

The UserCommunication package contains a class with a method that accepts commands from the user (administrator). This method runs in a separate thread from the rest of the server program and waits for commands to be entered on the command line.
A command can either be a command for retraining the model, which is passed to the Model, or a command for downloading new data files, which is passed to DataLoader.

Model retraining command: "retrain"

Download command: "dwn <month>" or "dwn <month_from> <month_to>", where <month(_from/_to)> is an int between 1 and 12
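
A minimal sketch of recognising these commands is shown below. The enum CommandKind, the class AdminCommand and its Parse method are hypothetical names, and the sketch assumes the keyword dwn for both forms of the download command.

```csharp
using System;

// Hypothetical types, not part of the documented server code.
public enum CommandKind { Retrain, Download, Invalid }

public static class AdminCommand
{
    // Parses one line typed by the administrator.
    // monthFrom and monthTo are only meaningful when the result is Download;
    // for the single-month form both hold the same value.
    public static CommandKind Parse(string line, out int monthFrom, out int monthTo)
    {
        monthFrom = monthTo = 0;
        string[] parts = (line ?? "").Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 1 && parts[0] == "retrain")
            return CommandKind.Retrain;
        if ((parts.Length == 2 || parts.Length == 3) && parts[0] == "dwn"
            && int.TryParse(parts[1], out monthFrom)
            && int.TryParse(parts[parts.Length - 1], out monthTo)
            && monthFrom >= 1 && monthFrom <= 12 && monthTo >= 1 && monthTo <= 12)
            return CommandKind.Download;
        monthFrom = monthTo = 0;
        return CommandKind.Invalid;
    }
}
```

In the server, the Retrain result would be forwarded to the Model and the Download result to DataLoader, as described above.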

Connection architecture

The Connection package takes care of receiving requests from clients and sending responses. It does this using the .NET HttpListener class.
It is built upon official example code by Microsoft (https://docs.microsoft.com/en-us/dotnet/api/system.net.httplistener.begingetcontext?view=net-5.0).

Communication with the client is driven by the rules specified on the page Server-client communication.
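
For orientation, a minimal HttpListener sketch follows. It is not the project's implementation: it uses the blocking GetContext call for brevity, whereas the referenced Microsoft example is built around BeginGetContext. The class ConnectionSketch, the localhost prefix and the fixed response body are assumptions.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical sketch, not part of the documented server code.
public static class ConnectionSketch
{
    // Builds the listener prefix for a given port; binding to localhost is an assumption.
    public static string PrefixFor(int port) => "http://localhost:" + port + "/";

    // Handles a single request synchronously and replies with a fixed text body.
    public static void ServeOnce(int port)
    {
        using (var listener = new HttpListener())
        {
            listener.Prefixes.Add(PrefixFor(port));
            listener.Start();
            HttpListenerContext context = listener.GetContext(); // blocks until a request arrives
            byte[] body = Encoding.UTF8.GetBytes("placeholder response");
            context.Response.ContentLength64 = body.Length;
            context.Response.OutputStream.Write(body, 0, body.Length);
            context.Response.OutputStream.Close();
            listener.Stop();
        }
    }
}
```

Calling ConnectionSketch.ServeOnce(10000) would answer one request on the default port from the configuration file.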

Model architecture

The model consists of several parts. One part of the module is responsible for extracting features from data sources and for preparing corresponding labels. These classes interact with the parser module. Another part of the module is a class with the implementation of the selected classifier. At the moment, the only classifier planned is the Naive Bayes classifier.

The expected control flow of this module is as follows:
  • The server handler calls for model re-training: The model receives attendance data from the corresponding parsers and labels it based on the percentage distribution of activity in the building. The model also receives the weather data corresponding to the attendance information and the identifier of the building. The module selects the existing classifier that is linked with the received building identifier. It extracts features from the weather data so that the feature vectors fit the selected classifier. The created feature vectors and labels are then used to re-train the model.
  • The server handler receives a client request for a prediction and calls prediction from the model module: The model selects the classifier corresponding to the requested building, receives current weather information and extracts features from it. It then predicts a possible attendance label based on those features.

Interface model-parser

Model can request parsing new data files. This request is done by calling the method Parse() from the class DataParser.

  public bool Parse(DateTime startTime, DateTime endTime, int interval = 1, bool wholeDay = true)

The Model specifies the time period it is interested in (using the parameters startTime and endTime, which allow a range of the form dd:mm:yyyy-dd:mm:yyyy). It also specifies whether data from one day should be aggregated into a single piece of information (parameter wholeDay set to true) or divided into intervals of a given length in hours (parameter wholeDay set to false and parameter interval set to the number of hours).

For example, if the request is made with wholeDay set to false and interval set to 3, each day is divided into 3-hour intervals. The following entries are created for each day:

  • 7-10h
  • 10-13h
  • 13-16h
  • 16-19h

If the request is made with wholeDay set to true, only one entry is created for each day, covering all events between 7:00 and 19:00.
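
The division of a day into entries can be sketched as follows. The class DaySlots is hypothetical, and cutting the last slot at 19:00 when the interval does not divide the 12-hour window evenly is an assumption; the text above only covers the evenly dividing case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the documented server code.
public static class DaySlots
{
    // Returns (startHour, endHour) pairs covering the 7:00-19:00 window.
    public static List<(int Start, int End)> Build(int interval, bool wholeDay)
    {
        const int dayStart = 7, dayEnd = 19;
        var slots = new List<(int Start, int End)>();
        if (wholeDay)
        {
            // One entry for the whole day; the interval is ignored.
            slots.Add((dayStart, dayEnd));
            return slots;
        }
        if (interval < 1)
            throw new ArgumentOutOfRangeException(nameof(interval));
        for (int start = dayStart; start < dayEnd; start += interval)
            slots.Add((start, Math.Min(start + interval, dayEnd))); // last slot cut at 19:00 (assumed)
        return slots;
    }
}
```

Build(3, false) produces the four entries 7-10, 10-13, 13-16 and 16-19, and Build(1, true) produces the single entry 7-19.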

The parsed information is afterwards stored in attributes of DataParser class: WeatherList and AttendanceList. WeatherList contains weather information obtained from data files, and AttendanceList contains the information about the amount of activity (jis activations plus webAuth data) that took place.

Parser architecture

The Parser part of the server is responsible for reading and parsing data from separate files and aggregating it in the way requested by the model. It expects input in the format specified in Data file structure and outputs a relatively universal set of information.

However, both the input and the output depend on the specific tags used in the data. If only these tags changed, the only class that would need changing is TagInfo. If the format of the input data files changed, the class CsvDataLoader would need to change. If the input were something other than jis and webauth activity, the packages InputInfo and Parsers would need to change. The output classes are written to be general (general weather information and activity information); however, if the input or output specification changed substantially (for instance, a newly added weather input such as fog), it would be better to rewrite or suitably modify the whole module, as long as the interface of DataParser is respected. There is a risk that some changes might also affect the Model, because the model depends to a degree on the information it is given: it extracts features from this information, and we cannot predict which extra features might be added.

Important classes (some of which were already mentioned above) to note are:

CsvLoader

Class responsible for loading input data files into memory. Can be swapped for a class processing different types of files as long as it provides the same methods.

Methods

List<JisInstance> LoadJisFile(string pathToFile)

Method that loads jis file into memory and returns each line translated into an instance of class JisInstance.

List<LogInInstance> LoadLoginFile(string pathToFile)

Method that loads computer login file into memory and returns each line translated into an instance of class LogInInstance.

List<WeatherInstance> LoadWeatherFile(string pathToFile)

Method that loads weather file into memory and returns each line translated into an instance of class WeatherInstance.

DataParser

Class responsible for parsing the input data into information. Can be swapped for a class processing different input files as long as it provides the same methods.

Attributes

Important attributes this class has to provide are the following:

  • List<WeatherInfo> WeatherList - list of WeatherInfo representing overall weather
  • List<ActivityInfo> AttendanceList - list of ActivityInfo representing overall activity
  • List<string> WeatherDataUsed - list of weather file names the parser was last used on
  • List<string> ActivityDataUsed - list of activity file names the parser was last used on

Methods

Important methods this class has to provide are the following:

bool Parse(DateTime startTime, DateTime endTime, int interval = 1, bool wholeDay = true)
Parameters:
  • DateTime startTime - start time of the time window we're interested in
  • DateTime endTime - end time of the time window we're interested in
  • int interval - length in hours of the intervals into which the data should be parsed (not taken into account if wholeDay is true)
  • bool wholeDay - true if data should be parsed as days, false if by intervals

Returns true if successful, false if not.

This method has to fill WeatherList, AttendanceList, WeatherDataUsed and ActivityDataUsed with current information. It uses separate parsers for Jis, computer login and weather data (JisParser, LoginParser and WeatherParser, as seen in the UML below). These parsers use DataDownloader to download the source files and CsvDataLoader to load the files into memory as input data, and then parse the input data into output information.

TagInfo

Tags specified in this class correspond to the tables with buildings and locations written down in Data sources.

WeatherPredictionParser architecture

This part of the server application is responsible for downloading new information about current weather predictions. It is designed to work with the following data source: http://wttr.in/?format=j1

A specific location can be set through the configuration file described in the Configuration chapter above.
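
A minimal sketch of resolving the prediction URL and downloading the JSON text is given below. The class WeatherPredictionSketch and its members are hypothetical; only the default URL comes from the Configuration chapter, and parsing of the downloaded JSON is left out.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical sketch, not part of the documented server code.
public static class WeatherPredictionSketch
{
    // Default site named in the Configuration chapter.
    public const string DefaultUrl = "http://wttr.in/Pilsen?format=j1";

    // Uses the configured URL when present, otherwise the documented default.
    public static string ResolveUrl(string configuredUrl) =>
        string.IsNullOrWhiteSpace(configuredUrl) ? DefaultUrl : configuredUrl.Trim();

    // Downloads the raw JSON text; parsing it is out of scope for this sketch.
    public static async Task<string> DownloadJsonAsync(string configuredUrl)
    {
        using (var client = new HttpClient())
        {
            return await client.GetStringAsync(ResolveUrl(configuredUrl));
        }
    }
}
```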

Interface Parser-DataLoader

The Parser requests the path to the folder with data files. It can further ask DataLoader to filter the data file names and return only those from a specified time period (mm:yyyy-mm:yyyy).

Client application architecture

The client application is a Unity application, so a UML diagram could prove misleading, as most classes are scripts attached to objects in the scene.

Two client applications exist - an Android client and a WebGL browser client app. Unity WebGL applications are not, generally, supported on mobile platforms. The Unity project is organized in two scenes - Android and WebGL, to be built under their respective platforms. The central component of each scene is the Unity Canvas, set to scale with screen size. Within the hierarchy of the Canvas, customized UI components are used to create the interface, along with a layered map.

Unity version 2019.4.20f1 (LTS) was used during development.

Android client

The minimum supported Android version is KitKat (4.4).

WebGL client

The client has been tested in the following browsers:

  • Vivaldi (3.6.2165.40)
  • Chrome (90.0.4430.93)
  • Mozilla Firefox (80.0)

WebGL and Android client design history is available on the page Client application design.

The WebGL application uses a template container site available at https://github.com/greggman/better-unity-webgl-template by user greggman. (Licensed as CC0)

Client-Server communication

Client-server communication is described on a separate page, Server-client communication.
