patents.google.com

CN111124832A - Data monitoring method and device, electronic equipment and storage medium - Google Patents

  • Fri May 08 2020

Disclosure of Invention

In view of this, embodiments of the present invention provide a data monitoring method and apparatus, an electronic device, and a storage medium, so as to solve the problem that data quality monitoring cannot simultaneously consider table monitoring tasks of multiple users in the prior art.

Therefore, the embodiment of the invention provides the following technical scheme:

according to a first aspect, an embodiment of the present invention provides a data monitoring method, including: acquiring a user level and a task level of a user to be monitored, wherein the user level is used for representing the priority of the user to be monitored, and the task level is used for representing the priority of a monitoring task of the same user to be monitored; determining a monitoring execution sequence according to the user level and the task level; and monitoring the data quality of the data to be monitored of the user to be monitored according to the monitoring execution sequence.

Optionally, determining a monitoring execution sequence according to the user level and the task level includes: judging whether the user levels are the same; if the user levels are different, the monitoring execution sequence gives the user to be monitored with the higher user level priority over the user to be monitored with the lower user level; and if the user levels are the same, monitoring is executed in turn according to a preset user ordering.

Optionally, determining a monitoring execution sequence according to the user level and the task level further includes: judging whether the task levels of the monitoring tasks of the same user to be monitored are the same; if the task levels are different, the monitoring execution sequence gives the monitoring task with the higher task level priority over the monitoring task with the lower task level; and if the task levels are the same, monitoring is executed in turn according to a preset monitoring task ordering.

Optionally, determining a monitoring execution sequence according to the user level and the task level, further comprising: if the user level is empty, determining that the user level is the lowest user level; and/or if the task level is empty, determining that the task level is the lowest task level.

Optionally, the data quality monitoring of the data to be monitored of the user to be monitored includes: acquiring data to be monitored and a monitoring threshold range of the user to be monitored; judging whether the data quality value of the data to be monitored is within a monitoring threshold range; and if the data quality value is within the monitoring threshold range, executing the next monitoring task.

Optionally, the obtaining the data to be monitored and the monitoring threshold range of the user to be monitored includes: acquiring configuration files, wherein the configuration files comprise a data source configuration file and a monitoring configuration file; the data source configuration file comprises data source configuration information of a plurality of data source classes; the monitoring configuration file comprises monitoring configuration information generated according to different monitoring requirements; instantiating a data source class according to the data source configuration file to generate a plurality of data sources; obtaining data to be monitored of the user to be monitored according to the data source; and obtaining a monitoring threshold range according to the monitoring configuration file.

Optionally, the data source configuration information includes at least one of a data source type, a host address, a host interface, a user name, a password, and an encoding; and/or the monitoring configuration information comprises at least one of a data source name to be monitored, a monitoring table name to be monitored, a monitoring threshold range and a monitoring account.

According to a second aspect, an embodiment of the present invention provides a data monitoring apparatus, including: an acquisition module, configured to acquire a user level and a task level of a user to be monitored, wherein the user level is used for representing the priority of the user to be monitored, and the task level is used for representing the priority of a monitoring task of the same user to be monitored; a first processing module, configured to determine a monitoring execution sequence according to the user level and the task level; and a second processing module, configured to monitor the data quality of the data to be monitored of the user to be monitored according to the monitoring execution sequence.

Optionally, the first processing module includes: a first judging unit, configured to judge whether the user levels are the same; a first processing unit, configured to give, in the monitoring execution sequence, the user to be monitored with the higher user level priority over the user to be monitored with the lower user level if the user levels are different; and a second processing unit, configured to execute monitoring in turn according to a preset user ordering if the user levels are the same.

Optionally, the first processing module further includes: a second judging unit, configured to judge whether the task levels of the monitoring tasks of the same user to be monitored are the same; a third processing unit, configured to give, in the monitoring execution sequence, the monitoring task with the higher task level priority over the monitoring task with the lower task level if the task levels are different; and a fourth processing unit, configured to execute monitoring in turn according to a preset monitoring task ordering if the task levels are the same.

Optionally, the first processing module further includes: a fifth processing unit, configured to determine that the user level is a lowest user level if the user level is empty; and/or the sixth processing unit is used for determining that the task level is the lowest task level if the task level is empty.

Optionally, the second processing module includes: the first acquisition unit is used for acquiring the data to be monitored and the monitoring threshold range of the user to be monitored; the third judging unit is used for judging whether the data quality value of the data to be monitored is within the monitoring threshold range; and the seventh processing unit is used for executing the next monitoring task if the data quality value is within the monitoring threshold range.

Optionally, the first acquisition unit includes: a first acquisition subunit, configured to acquire configuration files, wherein the configuration files comprise a data source configuration file and a monitoring configuration file, the data source configuration file comprises data source configuration information of a plurality of data source classes, and the monitoring configuration file comprises monitoring configuration information generated according to different monitoring requirements; a first processing subunit, configured to instantiate the data source classes according to the data source configuration file to generate a plurality of data sources; a second processing subunit, configured to obtain the data to be monitored of the user to be monitored according to the data sources; and a third processing subunit, configured to obtain a monitoring threshold range according to the monitoring configuration file.

Optionally, the data source configuration information includes at least one of a data source type, a host address, a host interface, a user name, a password, and an encoding; and/or the monitoring configuration information comprises at least one of a name of a data source to be monitored, a name of a monitoring table to be monitored, a monitoring threshold range and a monitoring account.

According to a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to cause the at least one processor to perform the data monitoring method as described in any one of the above first aspects.

According to a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where computer instructions are stored, and the computer instructions are used to enable a computer to execute the data monitoring method described in any one of the first aspect.

The technical scheme of the embodiment of the invention has the following advantages:

the embodiment of the invention provides a data monitoring method, a data monitoring device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring a user level and a task level of a user to be monitored, wherein the user level is used for representing the priority of the user to be monitored, and the task level is used for representing the priority of a monitoring task of the same user to be monitored; determining a monitoring execution sequence according to the user level and the task level; and monitoring the data quality of the data to be monitored of the user to be monitored according to the monitoring execution sequence. The user grades of different users to be monitored and the task grade of the same user to be monitored are subjected to priority sequencing to determine a monitoring execution sequence, and then data quality monitoring is performed on a plurality of tasks among a plurality of users according to the priority sequence, so that monitoring on a multi-user data table is realized, and the problem that the table monitoring tasks of the plurality of users cannot be considered simultaneously in the prior art is solved.

Detailed Description

The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In current data quality monitoring, one monitor allows only one user to establish the logic for monitoring table data; that is, a data monitoring task can be performed only on the data of a single user. The reason is that the data quality evaluation of each data table is closely tied to the business logic and development logic of the data, and even the same data table has different data quality requirements under different business scenarios and different needs. Monitoring of a specific data table is therefore generally implemented by the user directly querying the table, and users control data quality themselves: each user evaluates, according to their own requirements, the quality of the data tables they themselves produce. As a result, there is no data quality monitoring method or system that can be used by multiple users. Based on this, in this embodiment the monitoring tasks are prioritized according to the user levels of the different users to be monitored and the task levels of the same user to be monitored, so as to determine a monitoring execution sequence; data quality monitoring is then performed on the multiple tasks of the multiple users in that priority order, so that multi-user data tables are monitored.

Based on this, the embodiment of the present invention provides a data monitoring method, as shown in fig. 1, which may include steps S1-S3.

Step S1: acquire a user level and a task level of a user to be monitored, wherein the user level is used for representing the priority of the user to be monitored, and the task level is used for representing the priority of a monitoring task of the same user to be monitored.

As an exemplary embodiment, there may be a plurality of users to be monitored, each with a higher or lower priority; to monitor data quality reasonably and effectively, users with a higher priority are run first. In this embodiment, the user level may be divided into three levels: user highest priority, user medium priority, and user lowest priority. For convenience of program control, a parameter A may be set for the user level, where different values of A indicate different user priorities: 0 may indicate the highest user priority, 1 the medium user priority, and 2 the lowest user priority.

The data quality monitoring of the same user to be monitored may also comprise a plurality of monitoring tasks. These tasks may likewise be assigned different task levels so that data quality is monitored according to task level, with higher-level tasks executed first. Specifically, the task level is also divided into three levels: task highest priority, task medium priority, and task lowest priority. Similarly, for convenience of program control, a parameter B may be set for the task level, where different values of B indicate different task priorities: 0 may indicate the highest task priority, 1 the medium task priority, and 2 the lowest task priority.

The division of the user level and the task level is not limited to the exemplary examples described in the above embodiments. Specifically, as an optional embodiment, the number of divisions of the user level and the task level may also be different, for example, the user level is divided into 2 levels, the task level is divided into 4 levels, and the division number is set reasonably as required in practical application.
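As a minimal sketch of how the level parameters might be represented, the following uses integer enumerations in which a smaller value means a higher priority, matching the 0/1/2 convention above. The class and member names are hypothetical, and the two-level/four-level split simply mirrors the optional embodiment just described.

```python
from enum import IntEnum


class UserLevel(IntEnum):
    """Hypothetical encoding of parameter A with two levels (0 = highest)."""
    HIGHEST = 0
    LOWEST = 1


class TaskLevel(IntEnum):
    """Hypothetical encoding of parameter B with four levels (0 = highest)."""
    HIGHEST = 0
    HIGH = 1
    MEDIUM = 2
    LOWEST = 3


def higher_priority(a: int, b: int) -> bool:
    """Return True when level a takes precedence over level b."""
    return a < b
```

Because the members are plain integers, they can be used directly as sort keys when the execution sequence is determined.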

Step S2: determine a monitoring execution sequence according to the user level and the task level. The users are queued by priority, and the multiple tasks of the same user are ordered by priority. As an exemplary embodiment, after the user level and the task level are obtained, data quality monitoring is performed first for the higher priorities. Across different users, the user with the higher priority runs first; for the same user, the task with the higher priority runs first. Specifically, data quality monitoring is performed first on the data of the user with the higher user level, and when that user's data is monitored, the task with the higher task level is executed first.

Step S3: monitor the data quality of the data to be monitored of the user to be monitored according to the monitoring execution sequence. Multi-user, multi-task monitoring with priorities can be configured according to the monitoring execution sequence, and data quality monitoring is then performed following the user and task priorities in that sequence.

Through the steps, the user grades of different users to be monitored and the task grade of the same user to be monitored are subjected to priority sequencing to determine a monitoring execution sequence, and then the data quality monitoring is carried out on a plurality of tasks among a plurality of users according to the priority sequence, so that the multi-user data table is monitored, and the problem that the table monitoring tasks of the plurality of users cannot be considered simultaneously in the prior art is solved.

As an exemplary embodiment, the step of determining the monitoring execution order according to the user level and the task level at step S2, as shown in fig. 2, includes steps S21-S23.

Step S21: judge whether the user levels are the same. If the user levels are different, execute step S22; if the user levels are the same, execute step S23. The higher a user's level, the earlier that user's monitoring tasks are executed.

Step S22: if the user levels are different, the monitoring execution sequence gives the user to be monitored with the higher user level priority over the user to be monitored with the lower user level. That is, the monitoring tasks of the higher-level user are executed first, followed by those of the lower-level user, which realizes task monitoring across different users.

Step S23: if the user levels are the same, monitoring is executed in turn according to the preset user ordering.

In this embodiment, the preset user sorting order may be a default user order of the system, such as a user order in the monitoring task list (specifically, the user order in the written code), or an execution order determined according to an alphabetical sorting of the user name, and the like.
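Steps S21 to S23 can be sketched as a single sort, assuming users are given as (name, level) pairs with 0 as the highest level. The function name `order_users` and the `tie_break` parameter are hypothetical; the two tie-breaking modes correspond to the list order and the alphabetical order mentioned above.

```python
def order_users(users, tie_break="list"):
    """Order (user_name, user_level) pairs; 0 is the highest priority.

    tie_break="list" keeps users of equal level in their listed order;
    tie_break="name" orders users of equal level alphabetically by name.
    """
    if tie_break == "name":
        return sorted(users, key=lambda u: (u[1], u[0]))
    # sorted() is stable, so users with equal levels keep their list order.
    return sorted(users, key=lambda u: u[1])
```

A stable sort is what makes the "preset order" fallback work without any extra bookkeeping.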

As an exemplary embodiment, the step of determining the monitoring execution order according to the user level and the task level at step S2, as shown in fig. 3, includes steps S24-S26.

Step S24: judge whether the task levels of the monitoring tasks of the same user to be monitored are the same. If the task levels are different, execute step S25; if the task levels are the same, execute step S26. Among the multiple tasks of the same user, the higher a task's level, the earlier it is monitored, which realizes priority ordering among the tasks.

Step S25: if the task levels are different, the monitoring execution sequence gives the monitoring task with the higher task level priority over the monitoring task with the lower task level. That is, the higher-level task is monitored first, followed by the lower-level task, which realizes task monitoring among the multiple tasks of the same user.

Step S26: if the task levels are the same, monitoring is executed in turn according to the preset monitoring task ordering.

In this embodiment, the preset monitoring task ordering may be the default task order of the system, such as the task order in the monitoring task list (specifically, the order of the tasks in the written code), or an execution order determined by alphabetical sorting of the task names, and the like.

Through the steps, the user grade is judged firstly, the monitoring execution sequence is determined according to the user grade, and the monitoring of the user is executed preferentially when the user grade is higher; then, in the monitoring of a plurality of tasks of a certain user, judging the task level, determining a monitoring execution sequence according to the task level, wherein the higher the task level is, the task is preferentially executed; the priority ordering among different users and the priority ordering among a plurality of tasks of the same user are realized.
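The two-stage ordering summarized above can be sketched as one sort over all tasks. This is an illustration under stated assumptions, not the patented implementation: the `MonitorTask` record and `build_execution_order` function are hypothetical, levels use 0 as the highest priority, and the preset order is taken to be the order in which users and tasks are listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorTask:
    user: str
    user_level: int  # parameter A, 0 = highest priority
    name: str
    task_level: int  # parameter B, 0 = highest priority


def build_execution_order(tasks):
    """Order by user level, keep each user's tasks together, then by task level."""
    # Users of equal level are ranked by first appearance (the preset order).
    first_seen = {}
    for task in tasks:
        first_seen.setdefault(task.user, len(first_seen))
    # sorted() is stable, so tasks of equal task level keep their listed order.
    return sorted(
        tasks,
        key=lambda t: (t.user_level, first_seen[t.user], t.task_level),
    )
```

Including the user's first-appearance rank in the key keeps all tasks of one user contiguous even when several users share the same level.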

As an exemplary embodiment, the step of determining the monitoring execution order according to the user level and the task level in the step S2 further includes steps S27 and S28.

Step S27: if the user level is empty, determine the user level to be the lowest user level. That is, if a user's priority is not specified, it defaults to the lowest user level.

Step S28: if the task level is empty, determine the task level to be the lowest task level. That is, if a task's priority is not specified, it defaults to the lowest task level.

And when the user level or the task level is not specified, determining the execution sequence of the monitoring tasks through the steps to ensure the ordered execution of the monitoring tasks.
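Steps S27 and S28 amount to a default-value rule, which might be sketched as follows. The helper name `normalize_level` and the constants are hypothetical, and the value 2 assumes the three-level scheme of this embodiment.

```python
LOWEST_USER_LEVEL = 2  # assumes three user levels, where 2 is the lowest
LOWEST_TASK_LEVEL = 2  # assumes three task levels, where 2 is the lowest


def normalize_level(level, lowest):
    """Treat a missing or blank level as the lowest priority."""
    if level is None or str(level).strip() == "":
        return lowest
    return int(level)
```

Normalizing levels before sorting means the ordering logic never has to handle empty values.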

As an exemplary embodiment, the step of monitoring data quality of the data to be monitored of the user to be monitored in the step S3 includes steps S31-S34 as shown in fig. 4.

Step S31: acquire the data to be monitored and the monitoring threshold range of the user to be monitored. As an exemplary embodiment, the data to be monitored and the monitoring threshold range may be generated from a configuration file or provided directly by the user; this embodiment describes them only schematically and is not limited thereto.

In this embodiment, if the proportion of abnormal data is to be monitored, the monitoring threshold may be the maximum allowed proportion of abnormal values, and the monitoring threshold range is any value not exceeding that maximum; if the number of data records is to be monitored, the monitoring threshold may be the maximum allowed number, and the monitoring threshold range is any value not exceeding that number. In other examples, monitoring may also cover further types, such as whether the data contains certain special characters or digits, or whether the data is empty; data monitoring can be flexibly configured as needed.

Step S32: judge whether the data quality value of the data to be monitored is within the monitoring threshold range. If the data quality value is within the monitoring threshold range, execute step S33; if it is not, execute step S34.

Step S33: if the data quality value is within the monitoring threshold range, execute the next monitoring task. A value within the range indicates that the data in the data table has no quality abnormality, so the next monitoring task is executed, and so on until all monitoring tasks are completed.

Step S34: if the data quality value is not within the monitoring threshold range, send data quality early warning information. A value outside the range indicates that the data in the data table has a quality abnormality, and alarm information needs to be sent so that the abnormal data can be handled in time.

As an exemplary embodiment, the data quality early warning information may be sent by email, DingTalk message, short message, and the like; this embodiment describes this only schematically and is not limited thereto.

As a specific example, when the proportion of abnormal values in a data table is monitored, the monitoring threshold range may be determined from the maximum allowed proportion of abnormal values set for the monitor in the configuration file. If the proportion of abnormal values in the data table is greater than the maximum allowed proportion, the data in the data table is judged to be abnormal and an alarm is issued; if the proportion is less than or equal to the maximum allowed proportion, the data in the data table is normal and data monitoring continues.
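The comparison in this example can be sketched as below. What counts as an abnormal value is an assumption made for illustration (here `None` or an empty string), as are the function names and the choice to report 0.0 for an empty column.

```python
def abnormal_ratio(values):
    """Proportion of values treated as abnormal (None or empty string here)."""
    if not values:
        return 0.0  # assumption: an empty column has no abnormal values
    abnormal = sum(1 for v in values if v is None or v == "")
    return abnormal / len(values)


def is_within_threshold(values, max_ratio):
    """True when the abnormal proportion does not exceed the allowed maximum."""
    return abnormal_ratio(values) <= max_ratio
```

With a threshold of 0, any single abnormal value would place the table outside the monitoring threshold range.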

According to the method, if the data quality value of the data to be monitored is within the monitoring threshold range, the data quality has no problem, data monitoring can continue, and the next task is executed automatically; if the data quality value is not within the monitoring threshold range, the data quality has a problem, alarm information needs to be sent, and the person responsible for the data is notified in time. Through these steps, multiple kinds of monitoring can be applied to the data, abnormal data is monitored effectively, and data quality is ensured.
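Steps S32 to S34 can be sketched as a loop over the ordered tasks. The tuple layout, the function name `run_monitoring`, and the `send_alert` callback are hypothetical; the text does not say whether monitoring continues after an alarm, so continuing with the next task here is an assumption.

```python
def run_monitoring(ordered_tasks, send_alert):
    """Check each (task_name, quality_value, max_allowed) tuple in order.

    Calls send_alert(task_name, quality_value, max_allowed) for each task
    whose value exceeds its threshold and returns the names of those tasks.
    """
    alerted = []
    for name, value, max_allowed in ordered_tasks:
        if value <= max_allowed:
            continue  # steps S32/S33: within range, go to the next task
        send_alert(name, value, max_allowed)  # step S34: early warning
        alerted.append(name)  # assumption: keep going after an alert
    return alerted
```

Passing the alert channel in as a callback keeps the loop independent of whether the warning goes out by email, instant message, or short message.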

As an exemplary embodiment, the step of acquiring the data to be monitored and the monitoring threshold range of the user to be monitored in step S31 includes steps S311 to S314 as shown in fig. 5.

Step S311: acquiring configuration files, wherein the configuration files comprise a data source configuration file and a monitoring configuration file; the data source configuration file comprises data source configuration information of various data source classes; the monitoring configuration file comprises monitoring configuration information generated according to different monitoring requirements.

As an exemplary embodiment, the configuration files comprise two files: a data source configuration file and a monitoring configuration file. The data source configuration file contains data source configuration information for a plurality of data source classes; data sources such as relational databases and big data platforms can be connected according to this file, which is how the various data source classes are realized. The relational database may be MySQL, Oracle, and the like, and the big data platform may be HBase, Hive, and the like. The monitoring configuration file contains monitoring configuration information generated according to different monitoring requirements, and data quality monitoring is performed on the data according to this information.

Specifically, the data source configuration information includes at least one of a data source type, a host address, a host interface, a user name, a password, and an encoding. In this embodiment, the data source configuration information includes a data source type, a host address, a host interface, a user name, a password, and an encoding. In other exemplary embodiments, the data source configuration information may also include other information, such as a host name, which may be set appropriately as needed.

In this embodiment, the data source configuration is implemented in Python. Specifically, each data source must use data_source as the beginning of its configuration item name, followed by "_" and a serial number to distinguish the data sources. In other exemplary embodiments, the implementation may be based on other programming languages, such as C, C++, or Java; different languages have different syntax, and it is sufficient to follow the syntax of the language used.

Configuration of data sources: each data source section name must begin with data_source, and the section specifies the data source type, host address, host interface (port), user name, password, and encoding. A configuration demo is as follows:

[data_source_1]
data_source_type=mysql // specifies the data source type
host= // specifies the data source host address
port= // specifies the data source host port
user=zhangsan // specifies the data source user name
password=123456 // specifies the data source password
charset= // specifies the data source encoding

A plurality of data sources, such as data_source_2 and data_source_3, can be generated by acquiring the data source configuration information in the configuration file.

The data source type may specify a variety of common data sources, such as MySQL, DMADB, HIVE, and HDFS.

As a specific example, the following description shows an example, where the data source configuration file includes two data sources, namely mysql and hive, and the related data source configuration information in the data source configuration file is as follows:

(Lines beginning with ; are code comments, mainly to make the code easier to understand.)

; Data source
; The data source must begin with 'data_source'
[data_source_1]
; Specifies the data source type
data_source_type=mysql
; Specifies the data source host address
host=********
; Specifies the data source host port
port=3306
; Specifies the data source user name
user=bi_reader
; Specifies the data source password
password=********
; Specifies the data source encoding
charset=utf8

; A new HIVE data source only needs to specify data_source_type=hive
[data_source_2]
data_source_type=hive
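A configuration in this format can be read with Python's standard `configparser` module, which treats lines beginning with `;` as comments by default. The following is a minimal sketch rather than the embodiment's actual loader: the function name `load_data_sources` is hypothetical, and the host and password values in the sample are placeholders standing in for the masked values.

```python
import configparser

SAMPLE = """
; Data source
[data_source_1]
data_source_type=mysql
host=example-host
port=3306
user=bi_reader
password=placeholder
charset=utf8

[data_source_2]
data_source_type=hive
"""


def load_data_sources(text):
    """Return {section_name: {key: value}} for sections named data_source*."""
    # interpolation=None keeps '%' characters in values such as passwords literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if name.startswith("data_source")
    }
```

All values are returned as strings, so a caller that needs the port as a number would convert it explicitly.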

Alternatively, the configuration demo for HIVE may be as follows:

; HIVE data source
[data_source_2]
data_source_type=hive // specifies the data source type
host= // specifies the data source host address
port= // specifies the data source host port
user=zhangsan // specifies the data source user name
password=123456 // specifies the data source password
charset= // specifies the data source encoding

As an exemplary embodiment, the monitoring configuration information includes at least one of a name of a data source to be monitored, a name of a monitoring table to be monitored, a monitoring threshold range, and a monitoring account. In this embodiment, the monitoring configuration information includes a name of a data source to be monitored, a name of a monitoring table to be monitored, a monitoring threshold range, and a monitoring account (account information corresponding to an alarm when monitoring is abnormal). In other exemplary embodiments, the monitoring configuration information may also include other information, which may be flexibly configurable, if desired, simply by implementing the relevant logic in the code.

Data source to be monitored: specifies the data source of the table to be monitored; fill in the name of the corresponding data_source configuration item. Name of the table to be monitored: specifies the data table to be monitored, with the database name placed before the data table name and separated from it by a delimiter mark. Monitoring threshold range: configured as a JSON array; multiple JSON objects can be configured, each specifying the information for one monitored field. Each JSON object must contain a field entry specifying the data field to be monitored, and a threshold entry specifying the maximum allowed proportion of empty and null values for that field, in the range [0,1]. Monitoring account: specifies the account that receives the detailed warning, that is, the account information corresponding to the alarm.

Configuration of the monitor: each monitor section name must begin with monitor, and the section specifies the monitor's data source name, the name of the monitored table, the monitoring threshold, the monitoring account, and the minimum number of rows of the monitored table (only positive integers and 0 may be specified). A configuration demo is as follows:

[monitor_2]
data_source= // specifies the monitor's data source name
monitor_table= // specifies the name of the table monitored by the monitor
field_threshold=[{'field':'status','threshold':0}] // specifies the monitored field names and the maximum tolerated proportion of abnormal data, in JSON list format; one list may contain several JSON objects, and each object must have 'field' and 'threshold' entries
owner=*** // specifies the account of the monitor creator
partition_field= // specifies the name of the partition field
partition= // specifies the generation rule for the time partition key
min_rows= // minimum number of rows allowed for the table

Multiple monitors may be generated by obtaining monitoring configuration information within a configuration file.
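A monitor section of this kind might be parsed and validated as sketched below. Since the demo writes the threshold list with single quotes, a strict JSON parser would reject it, so this sketch uses `ast.literal_eval` instead. The function name `load_monitors`, the sample table name, and the choice to treat a missing `min_rows` as 0 are assumptions made for illustration.

```python
import ast
import configparser

SAMPLE = """
[monitor_1]
data_source=data_source_1
monitor_table=demo_table
field_threshold=[{'field':'status','threshold':0}]
min_rows=0
"""


def load_monitors(text):
    """Parse and validate every section whose name starts with 'monitor'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    monitors = {}
    for name in parser.sections():
        if not name.startswith("monitor"):
            continue
        section = parser[name]
        # literal_eval safely evaluates the single-quoted list of dicts.
        rules = ast.literal_eval(section["field_threshold"])
        for rule in rules:
            if "field" not in rule or "threshold" not in rule:
                raise ValueError(f"{name}: each rule needs 'field' and 'threshold'")
            if not 0 <= rule["threshold"] <= 1:
                raise ValueError(f"{name}: threshold must be in [0, 1]")
        min_rows = section.getint("min_rows", fallback=0)
        if min_rows < 0:
            raise ValueError(f"{name}: min_rows must be 0 or a positive integer")
        monitors[name] = {
            "data_source": section["data_source"],
            "monitor_table": section["monitor_table"],
            "field_threshold": rules,
            "min_rows": min_rows,
        }
    return monitors
```

Validating at load time means a malformed threshold is reported before any monitoring task runs.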

As a specific example, if the tables to be monitored include a MySQL data table and a Hive data table, the monitoring configuration file is configured as follows:

(Lines beginning with ';' are comments, included mainly to make the configuration easier to understand.)

; Monitor unit

; The monitor name must start with 'monitor'

[monitor_1]

; Specifies the monitor's data source name

data_source=data_source_1

; Specifies the name of the table monitored by the monitor

monitor_table=******

; Specifies the monitored field names and the maximum tolerable proportion of abnormal data, in JSON list format; one list may contain multiple JSON objects, and each must have 'field' and 'threshold' fields

field_threshold=[{'field':'********','threshold':1}]

; Specifies the pinned account of the monitor creator

owner=*** // account of the monitor creator

min_rows // minimum number of rows allowed in the table

[monitor_lsq]

data_source=data_source_2 // specifies the monitor's data source name

monitor_table // specifies the name of the table monitored by the monitor

field_threshold=[{'field':'********','threshold':0}] // specifies the monitored field names and the maximum tolerable proportion of abnormal data, in JSON list format; one list may contain multiple JSON objects, and each must have 'field' and 'threshold' fields

owner=*** // account of the monitor creator

; partition_field specifies the name of the partition field

partition_field=********

; partition specifies the rule for computing the time partition key: 0 means the current day, -1 the previous day, 1 the next day, and so on; only positive integers, negative integers and 0 are allowed

partition // rule for generating the time partition key

min_rows // minimum number of rows allowed in the table
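The partition rule described in the comments above (0 for the current day, -1 for the previous day, 1 for the next day) amounts to a day offset applied to the run date. A minimal sketch follows; the function name `partition_key` and the `%Y%m%d` key format are assumptions, since the text does not state how the partition key is formatted.

```python
from datetime import date, timedelta


def partition_key(offset_days: int, today: date, fmt: str = "%Y%m%d") -> str:
    """Return the time partition key for `today` shifted by `offset_days`.

    The default format is an assumption; adjust it to the table's real layout.
    """
    return (today + timedelta(days=offset_days)).strftime(fmt)


# Example: a monitor configured with an offset of -1 that runs on 2020-03-01
# checks the partition of the previous day.
previous_day = partition_key(-1, date(2020, 3, 1))  # "20200229"
```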

Step S312: instantiate the data source classes according to the data source configuration file to generate a plurality of data sources.

In this embodiment, after the MySQL and Spark data sources are instantiated, a connection object is generated for each of the two data sources for use in subsequent steps.
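Instantiating data source classes from configuration can be pictured as a lookup from the configured type to a class. The sketch below is hypothetical throughout: the class names, the `REGISTRY` mapping, the configuration keys, and the `connect` method that returns a descriptive string in place of a real connection object are all assumptions, and no database driver is used.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    name: str
    host: str
    port: int

    def connect(self) -> str:
        raise NotImplementedError


class MySQLSource(DataSource):
    def connect(self) -> str:
        # Placeholder for a real connection object.
        return f"mysql://{self.host}:{self.port}"


class SparkSource(DataSource):
    def connect(self) -> str:
        # Placeholder for a real connection object.
        return f"spark://{self.host}:{self.port}"


# Maps the configured data source type to the class that handles it.
REGISTRY = {"mysql": MySQLSource, "spark": SparkSource}


def build_sources(configs: dict) -> dict:
    """Instantiate one data source object per configured entry."""
    sources = {}
    for name, cfg in configs.items():
        cls = REGISTRY[cfg["type"].lower()]
        sources[name] = cls(name=name, host=cfg["host"], port=int(cfg["port"]))
    return sources
```

With this layout, supporting another data source type only requires a new class and one more `REGISTRY` entry.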

Step S313: obtain the data to be monitored of the user to be monitored according to the data source. The instantiated data source connection object establishes the connection to the data source, through which the data to be monitored in that data source can be obtained.

Step S314: obtain a monitoring threshold range according to the monitoring configuration file. The monitoring configuration file includes the monitoring threshold range, which serves as the criterion for judging data abnormality: if the data quality value of the data to be monitored is within the monitoring threshold range, the data quality is good; if it is outside the range, the data is abnormal and the abnormal data needs to be handled. The handling may consist of sending corresponding data quality warning information to a preset monitoring account, for example sending an alarm email to the monitoring account. With the above steps, data monitoring becomes configurable: monitoring of abnormal values in table fields can be achieved merely by modifying the configuration files and adding the required configuration items, which improves the quality of the monitored data.

As an exemplary embodiment, the upper limit of the monitoring threshold range may be the maximum proportion of abnormal values allowed by data monitoring, which may be set by the user. If the proportion of abnormal values detected in the monitored table is greater than the allowed proportion set by the user, an alarm is raised; otherwise no alarm is raised and the next task is executed automatically.
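The comparison described here can be sketched as follows. It assumes that an abnormal value is a null or an empty string, and the function names `abnormal_ratio` and `should_alarm` are hypothetical.

```python
def abnormal_ratio(values: list) -> float:
    """Proportion of values that are None or an empty string."""
    if not values:
        return 0.0  # an empty column has no abnormal values to report
    abnormal = sum(1 for v in values if v is None or v == "")
    return abnormal / len(values)


def should_alarm(values: list, threshold: float) -> bool:
    """Alarm only when the abnormal proportion exceeds the allowed proportion."""
    return abnormal_ratio(values) > threshold
```

A column whose abnormal proportion equals the threshold exactly does not raise an alarm, which matches the "greater than" wording above.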

Through the above steps, multiple types of data sources are supported, and most common data sources (such as MySQL, DMADB, HIVE, HDFS and the like) can be integrated. The monitoring content can be freely configured: a user can realize different data monitoring functions merely by modifying the configuration file and adding the required configuration items, which improves the flexibility of data monitoring. This overcomes the drawback of the current industry practice of monitoring data quality through scheduled tasks and SQL statements, which cannot be configured.

In this embodiment, a data monitoring apparatus is further provided. The apparatus is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

The present embodiment further provides a data monitoring apparatus, as shown in fig. 6, including: an acquisition module 61, a first processing module 62 and a second processing module 63.

The acquisition module 61 is configured to acquire a user level and a task level of a user to be monitored, where the user level is used to represent a priority of the user to be monitored, and the task level is used to represent a priority of a monitoring task of the same user to be monitored; the details are described with reference to step S1.

The first processing module 62 is configured to determine a monitoring execution sequence according to the user level and the task level; the details are described with reference to step S2.

The second processing module 63 is configured to perform data quality monitoring on the data to be monitored of the user to be monitored according to the monitoring execution sequence; the details are described with reference to step S3.

As an exemplary embodiment, the first processing module includes: a first judging unit, configured to judge whether the user levels are the same, where the detailed content refers to step S21; a first processing unit, configured to, if the user levels are different, set the monitoring execution sequence so that a user to be monitored with a high user level takes priority over a user to be monitored with a low user level, where the detailed content refers to step S22; and a second processing unit, configured to, if the user levels are the same, set the monitoring execution sequence to follow the preset user ordering, where the detailed content refers to step S23.

As an exemplary embodiment, the first processing module further comprises: a second judging unit, configured to judge whether the task levels of the monitoring tasks of the same user to be monitored are the same, where the detailed content refers to step S24; a third processing unit, configured to, if the task levels are different, set the monitoring execution sequence so that a monitoring task with a high task level takes priority over a monitoring task with a low task level, where the detailed content refers to step S25; and a fourth processing unit, configured to, if the task levels are the same, set the monitoring execution sequence to follow the preset monitoring task ordering, where the detailed content refers to step S26.

As an exemplary embodiment, the first processing module further comprises: a fifth processing unit, configured to determine that the user level is a lowest user level if the user level is empty, and refer to step S27 for details; and/or a sixth processing unit, configured to determine that the task level is the lowest task level if the task level is empty, where details are described with reference to step S28.
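The ordering rules handled by the first processing module can be sketched as a single sort. The sketch makes several assumptions that the text does not fix: a larger number means a higher level, the preset user ordering and preset task ordering are taken to be the order of first appearance in the input list, all tasks of one user carry the same user level, and the names `MonitorTask` and `execution_order` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MonitorTask:
    user: str
    name: str
    user_level: Optional[int]  # None models an empty user level
    task_level: Optional[int]  # None models an empty task level


def execution_order(tasks: list) -> list:
    """Order tasks by user level, preset user order, then task level."""

    def level(value: Optional[int]) -> float:
        # An empty level is treated as the lowest possible level.
        return float("-inf") if value is None else value

    # Preset user order: position of each user's first appearance.
    user_position = {}
    for task in tasks:
        user_position.setdefault(task.user, len(user_position))

    # sorted() is stable, so tasks with equal keys keep their preset order.
    return sorted(
        tasks,
        key=lambda t: (-level(t.user_level), user_position[t.user], -level(t.task_level)),
    )
```

For example, given two users of equal level, all tasks of the user listed first run before any task of the other, and within each user the higher task levels run first.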

As an exemplary embodiment, the second processing module includes: a first obtaining unit, configured to obtain data to be monitored and a monitoring threshold range of the user to be monitored, where details refer to step S31; a third determining unit, configured to determine whether a data quality value of the data to be monitored is within a monitoring threshold range, where details refer to step S32; and a seventh processing unit, configured to execute a next monitoring task if the data quality value is within the monitoring threshold range, where details are described with reference to step S33.
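The behaviour of the second processing module can be pictured as a loop over the ordered tasks. The sketch below is hypothetical: each task is reduced to a tuple of owner, table, measured abnormal proportion and threshold, the `notify` callback stands in for whatever alarm channel is used, and continuing with the next task after an alarm is an assumption of this sketch.

```python
from typing import Callable, Iterable, List, Tuple

# (owner, table, measured abnormal proportion, allowed threshold)
Task = Tuple[str, str, float, float]


def run_monitors(tasks: Iterable[Task], notify: Callable[[str, str], None]) -> List[str]:
    """Run tasks in the given order and return the tables that raised alarms."""
    alarmed = []
    for owner, table, ratio, threshold in tasks:
        if ratio <= threshold:
            continue  # within the threshold range: go on to the next task
        notify(owner, f"{table}: abnormal ratio {ratio:.2f} exceeds {threshold:.2f}")
        alarmed.append(table)
    return alarmed
```

Passing `notify` as a parameter keeps the loop independent of the delivery mechanism, so an email sender or a test stub can be supplied without changing it.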

As an exemplary embodiment, the first acquisition unit includes: a first obtaining subunit, configured to obtain a configuration file, where the configuration file includes a data source configuration file and a monitoring configuration file, where the data source configuration file includes data source configuration information of multiple data source classes, the monitoring configuration file includes monitoring configuration information generated according to different monitoring requirements, and the detailed content refers to step S311; a first processing subunit, configured to instantiate a data source class according to the data source configuration file, to generate multiple data sources, and refer to step S312 for details; a second processing subunit, configured to obtain data to be monitored of the user to be monitored according to the data source, and refer to step S313 for details; a third processing subunit, configured to obtain a monitoring threshold range according to the monitoring configuration file, where the detailed content refers to step S314.

As an exemplary embodiment, the data source configuration information includes at least one of a data source type, a host address, a host interface, a user name, a password, and an encoding; and/or the monitoring configuration information includes at least one of a name of a data source to be monitored, a name of a table to be monitored, a monitoring threshold range and a monitoring account.

The data monitoring apparatus in this embodiment is presented in the form of a functional unit, where the unit refers to an ASIC circuit, a processor and a memory executing one or more software or fixed programs, and/or other devices that can provide the above-described functions.

Further functional descriptions of the modules are the same as those of the corresponding embodiments, and are not repeated herein.

An embodiment of the present invention further provides an electronic device, as shown in fig. 7. The electronic device includes one or more processors 71 and a memory 72, where one processor 71 is taken as an example in fig. 7.

The electronic device may further include: an input device 73 and an output device 74.

The processor 71, the memory 72, the input device 73 and the output device 74 may be connected by a bus or other means, as exemplified by the bus connection in fig. 7.

The processor 71 may be a Central Processing Unit (CPU). The processor 71 may also be another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination thereof. A general purpose processor may be a microprocessor or any conventional processor.

The memory 72, which is a non-transitory computer readable storage medium, can be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the data monitoring method in the embodiments of the present application. The processor 71 executes various functional applications of the server and data processing by running non-transitory software programs, instructions and modules stored in the memory 72, namely, implements the data monitoring method of the above-described method embodiment.

The memory 72 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created according to use of a processing device operated by the server, and the like. Further, the memory 72 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 72 may optionally include memory located remotely from the processor 71, which may be connected to a network connection device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 73 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the processing device of the server. The output device 74 may include a display device such as a display screen.

One or more modules are stored in the memory 72 and, when executed by the one or more processors 71, perform the methods shown in fig. 1-5.

It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above embodiments of the data monitoring method. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a flash memory, a Hard Disk Drive (HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kinds described above.

Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.