
CN113759841B - Multi-objective optimized machine tool flexible workshop scheduling method and system - Google Patents

Fri Jan 12 2024

Multi-objective optimized machine tool flexible workshop scheduling method and system

Info

Publication number

CN113759841B

Authority

China

Prior art keywords

machine tool

dqn

eda

scheduling method

time

Prior art date

2021-08-26

Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)

Active

Application number

CN202110986700.2A

Other languages

Chinese (zh)

Other versions

CN113759841A (en

Inventor

杜宇

李俊青

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Shandong Normal University

Original Assignee

Shandong Normal University

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2021-08-26

Filing date

2021-08-26

Publication date

2024-01-12

2021-08-26 Application filed by Shandong Normal University

2021-08-26 Priority to CN202110986700.2A

2021-12-07 Publication of CN113759841A

2024-01-12 Application granted

2024-01-12 Publication of CN113759841B

Status: Active

2041-08-26 Anticipated expiration

Classifications

- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/418—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
- G05B19/41885—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM] characterised by modeling, simulation of the manufacturing system
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/32—Operator till task planning
- G05B2219/32339—Object oriented modeling, design, analysis, implementation, simulation language
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

Engineering & Computer Science (AREA)
Manufacturing & Machinery (AREA)
General Engineering & Computer Science (AREA)
Quality & Reliability (AREA)
Physics & Mathematics (AREA)
General Physics & Mathematics (AREA)
Automation & Control Theory (AREA)
General Factory Administration (AREA)
Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides a multi-objective optimized machine tool flexible job shop scheduling method, including: initializing a target network and an online network Q; performing deep Q-network (DQN) training using an activation function; initializing a population with initialization strategies; and performing non-dominated sorting and crowding-distance sorting to obtain better solutions. The present disclosure exploits the advantages of evolutionary algorithms (EAs) and deep reinforcement learning (DRL), proposing EDA-DQN, a hybrid of a knowledge-based estimation of distribution algorithm (EDA) and DQN, for the flexible job shop scheduling problem (FJSP). The maximum completion time and the total electricity cost are optimized using a Pareto-based method. In EDA-DQN, the DQN part is a local search selector that chooses a suitable local search strategy for each scheduling state; the EDA part improves the exploration capability of the algorithm, and the DQN part improves its exploitation capability.

Description

Multi-objective optimized machine tool flexible workshop scheduling method and system

Technical Field

The disclosure relates to the technical field of job shop scheduling, and in particular to a multi-objective optimized machine tool flexible job shop scheduling method and system.

Background

The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.

The flexible job shop scheduling problem (FJSP) has been widely studied and applied in fields such as semiconductor manufacturing, chemical processing, and mobile phone assembly. In practical production, economic and green development requires accounting for time-of-use electricity prices (TOUEP), so electricity should preferably be consumed during low-price periods. Furthermore, the processing speed of each machine tool should balance the production plan against the profit of the enterprise: operations with high power consumption should be scheduled in low-price periods, while operations with low power consumption can be scheduled in high-price periods.

The prior art solves the FJSP under various constraints, but its performance degrades sharply on large-scale problems. To address this, reinforcement learning (RL), which is currently regarded as having strong intelligent optimization capability, can be used to enhance optimization for the FJSP. Recently, deep reinforcement learning (DRL), which combines RL with deep learning, has been increasingly applied to combinatorial optimization problems (COPs), demonstrating its strong ability to solve them.

Several studies have used RL to optimize the parameters of evolutionary algorithms (EAs). Emary et al. optimized the parameters of the grey wolf optimizer (GWO) using RL and neural networks. Cao et al. proposed a cuckoo search algorithm with RL and surrogate models to solve the scheduling problem of semiconductor final testing, in which RL is used to maintain population diversity and search intensity. Although RL has been applied to enhance optimization by adjusting the parameter values of EAs, the intelligent capabilities of RL are not directly involved in the optimization process.

More recently, research has applied RL directly as the optimization strategy. Lin et al. used a multi-stage DQN to solve the shop scheduling problem, optimizing the maximum completion time. Park et al. proposed an RL-based setup-change scheduling method to minimize the maximum completion time in chip production. Luo designed a DQN algorithm for the dynamic FJSP, providing a basic algorithm for solving the FJSP with RL. Hu et al. solved the flexible shop production problem with an automated guided vehicle scheduling method using mixed rules, optimizing the maximum completion time and the delay ratio. He et al. solved the agile satellite scheduling problem using a DQN model. Park et al. combined graph neural networks with RL to solve the shop scheduling problem, training the model with RL-based proximal policy optimization. Zhao et al. proposed an RL-based cooperative water wave optimization algorithm to solve the distributed assembly no-wait flow shop scheduling problem. Han and Yang constructed an end-to-end DRL framework to solve the FJSP, in which an improved pointer network encodes the operations and a convolutional neural network serves as the decoding network. Kim and Lee proposed using Petri nets as the RL environment to solve the flow shop scheduling problem, optimizing the maximum completion time. Xu et al. designed an RL-based differential evolution algorithm to solve the multi-stage energy scheduling problem of an industrial integrated energy system. These studies solved COPs with various RL methods and exploited the intelligence of RL; however, most of them optimized only single-objective problems, and RL methods for multi-objective problems need further investigation. In addition, the advantages of EAs and RL need to be combined to further improve optimization performance.

Disclosure of Invention

To overcome the shortcomings of the prior art, the present disclosure provides a multi-objective optimized machine tool flexible job shop scheduling method and system. Exploiting the advantages of EAs and DRL, it proposes EDA-DQN, a hybrid of a knowledge-based estimation of distribution algorithm (EDA) and DQN, for the FJSP. The maximum completion time and the total electricity cost are optimized using a Pareto-based method. In EDA-DQN, the DQN part is a local search selector that chooses a suitable local search strategy for each scheduling state; the EDA part improves the exploration capability of the algorithm, and the DQN part improves its exploitation capability.

In order to achieve the above purpose, the present disclosure adopts the following technical scheme:

A first aspect of the present disclosure provides a multi-objective optimized machine tool flexible job shop scheduling method.

A multi-objective optimized machine tool flexible job shop scheduling method comprises the following steps:

acquiring shop machine tool constraints, including machine tool processing speed, machine tool setup time, idle time, time-of-use electricity price, and handling time;

optimizing the maximum completion time and the total electricity cost using an EDA-DQN model;

obtaining an optimization result;

wherein optimizing the maximum completion time and the total electricity cost with the EDA-DQN model specifically comprises the following steps:

initializing a target network and an online network Q;

performing deep Q-network (DQN) training using an activation function;

initializing a population with initialization strategies;

and performing non-dominated sorting and crowding-distance sorting to obtain better solutions.

Further, the population initialization includes designing a first initialization strategy, a second initialization strategy, and a random initialization strategy based on a scheduling vector (SV), a machine tool allocation vector (MAV), and a processing speed vector (PSV).

Further, when the population is initialized, one third of the individuals are generated by the first initialization strategy, one third by the second initialization strategy, and one third by the random initialization strategy.

Further, in the DQN, the numbers of input and output nodes equal the number of state features and the number of actions, respectively.

Further, the DQN training specifically includes: adopting a fully connected neural network with 2 hidden layers, wherein the number of hidden-layer nodes is 20; the activation function from the input layer to the hidden layer is the ReLU function, and the activation function of the output layer is the purelin (linear) function.

Further, the DQN training further comprises:

updating the parameter θ of the online network Q by gradient descent on the training loss, and obtaining the target network from the online network.

Further, in the non-dominated sorting and crowding-distance sorting, for two solutions x and y, x dominates y if

$(C_{\max,x} \le C_{\max,y}) \land (TEC_x < TEC_y)$

or

$(C_{\max,x} < C_{\max,y}) \land (TEC_x \le TEC_y)$;

otherwise, x and y do not dominate each other.

A second aspect of the present disclosure provides a multi-objective optimized machine tool flexible job shop scheduling system.

A multi-objective optimized machine tool flexible shop scheduling system, comprising:

an initialization module configured to: initialize a target network and an online network;

a network training module configured to: perform deep Q-network (DQN) training using an activation function;

an initialization strategy module configured to: initialize a population with initialization strategies;

a sorting module configured to: perform non-dominated sorting and crowding-distance sorting.

A third aspect of the present disclosure provides a medium having stored thereon a program which when executed by a processor implements the steps in a multi-objective optimized machine tool flexible shop scheduling method according to the first aspect of the present disclosure.

A fourth aspect of the present disclosure provides an apparatus comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing steps in a multi-objective optimized machine tool flexible shop scheduling method according to the first aspect of the present disclosure when the program is executed.

Compared with the prior art, the beneficial effects of the present disclosure are:

the present disclosure exploits the advantages of EAs and DRL, proposing EDA-DQN, a hybrid of a knowledge-based estimation of distribution algorithm (EDA) and DQN, for the FJSP. The maximum completion time and the total electricity cost are optimized using a Pareto-based method. In EDA-DQN, the DQN part is a local search selector that chooses a suitable local search strategy for each scheduling state; the EDA part improves the exploration capability of the algorithm, and the DQN part improves its exploitation capability;

the present disclosure considers an FJSP with constraints including variable processing speed, setup time, idle time, workpiece handling, and time-of-use electricity prices; the maximum completion time and the total electricity cost are optimized with the hybrid algorithm of EDA and DQN; 34 state features and 9 actions are designed in the DQN part to enhance the exploitation capability of the algorithm, and the problem-based EDA part enhances its exploration capability.

Drawings

The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate and explain the exemplary embodiments of the disclosure and together with the description serve to explain the disclosure, and do not constitute an undue limitation on the disclosure.

Fig. 1 is a schematic flow chart of the EDA-DQN model provided in Embodiment 1 of the present disclosure.

Fig. 2 is a diagram of the solution encoding provided in Embodiment 1 of the present disclosure.

Fig. 3 is a diagram of the state features considered by the EDA-DQN model provided in Embodiment 1 of the present disclosure.

Fig. 4 shows the results of the parameter calibration experiment provided in Embodiment 1 of the present disclosure.

Fig. 5 shows the comparison results of the initialization strategies provided in Embodiment 1 of the present disclosure.

Fig. 6 shows the comparison results of the exploration and exploitation capabilities provided in Embodiment 1 of the present disclosure.

Fig. 7 shows the comparison results of the action selection strategies provided in Embodiment 1 of the present disclosure.

Fig. 8 shows the comparison results of algorithm performance provided in Embodiment 1 of the present disclosure.

Detailed Description

The disclosure is further described below with reference to the drawings and examples.

It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments in accordance with the present disclosure. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.

Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.

Example 1:

As shown in Fig. 1, Embodiment 1 of the present disclosure provides a multi-objective optimized machine tool flexible job shop scheduling method, including acquiring shop machine tool constraints, including machine tool processing speed, machine tool setup time, idle time, time-of-use electricity price, and handling time;

optimizing the maximum completion time and the total electricity cost using an EDA-DQN model;

obtaining an optimization result;

wherein optimizing the maximum completion time and the total electricity cost with the EDA-DQN model specifically comprises the following steps:

initializing a target network and an online network Q;

performing deep Q-network (DQN) training using an activation function;

initializing a population with initialization strategies;

and performing non-dominated sorting and crowding-distance sorting to obtain better solutions.

Specifically, the method comprises the following parts:

1. Encoding and decoding

The FJSP solution considered in the present disclosure is encoded with a three-dimensional encoding strategy. First, the processing order of all operations is controlled by a scheduling vector (SV), in which each element is the workpiece number of the corresponding operation, and the j-th occurrence of a workpiece number represents the j-th operation of that workpiece. Second, the machine tool assignment of all operations is controlled by a machine tool allocation vector (MAV), in which each machine tool number corresponds one-to-one to an operation in the SV. Finally, the processing speeds of all operations are controlled by a processing speed vector (PSV), in which each processing speed level corresponds one-to-one to a position in the MAV. Fig. 2 shows the encoding of a solution for an FJSP with 3 workpieces and 3 machine tools.
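The occurrence-based reading of the SV can be illustrated with a minimal Python sketch. The vectors below are made-up values for a 3-workpiece instance (they are not the contents of Fig. 2), and the function name `decode_sv` is hypothetical.

```python
from collections import defaultdict

def decode_sv(sv):
    """Map each SV position to (workpiece, operation index).

    The j-th occurrence of a workpiece number denotes its j-th operation.
    """
    seen = defaultdict(int)
    operations = []
    for workpiece in sv:
        seen[workpiece] += 1
        operations.append((workpiece, seen[workpiece]))
    return operations

# Illustrative vectors only: positions in MAV and PSV align with positions in SV.
sv = [2, 1, 2, 3, 1, 3]
mav = [1, 3, 1, 2, 3, 2]
psv = [1, 2, 1, 3, 2, 1]

for (workpiece, op), machine, speed in zip(decode_sv(sv), mav, psv):
    print(f"O_{workpiece},{op} -> machine tool {machine}, speed level {speed}")
```

Reading the three vectors position by position yields, for every operation, its place in the processing order, its machine tool, and its speed level.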

In the decoding process, the operations are scheduled in the order given by the SV, and the start and end times of each operation are determined by the preceding operation of the same workpiece, the machine tool setup time, the machine tool idle time, and the handling time. For the objective values, the maximum completion time is the end time of the last completed operation, and the total electricity cost is the sum of all electricity costs incurred under the TOUEP setting.
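As a rough illustration of how the two objective values can be computed once start and end times are known, the sketch below charges each operation by the overlap of its interval with each price period. The tariff, the operation list, and the function name `tou_cost` are hypothetical; setup, idle, and handling consumption are omitted for brevity.

```python
def tou_cost(start, end, power, tariff):
    """Electricity cost of running at constant `power` over [start, end).

    `tariff` is a list of (period_start, period_end, price) tuples;
    all numbers used here are illustrative assumptions.
    """
    cost = 0.0
    for period_start, period_end, price in tariff:
        overlap = max(0.0, min(end, period_end) - max(start, period_start))
        cost += overlap * power * price
    return cost

# Hypothetical tariff: cheap at night, expensive during the day.
tariff = [(0, 8, 0.5), (8, 16, 1.0), (16, 24, 0.5)]
# Hypothetical decoded schedule: (start, end, power) per operation.
operations = [(6, 10, 2.0), (10, 12, 1.0)]

makespan = max(end for _, end, _ in operations)
tec = sum(tou_cost(s, e, p, tariff) for s, e, p in operations)
print(makespan, tec)
```

Shifting a high-power operation into a cheaper period lowers `tec` without necessarily changing `makespan`, which is the trade-off the two objectives capture.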

2. Initializing a population:

A reasonable initialization strategy provides a good starting point for both exploration and exploitation. The present disclosure designs three different initialization strategies based on the SV, MAV, and PSV. To obtain a diverse initial population, one third of the individuals are generated by initialization strategy 1, one third by initialization strategy 2, and the remaining individuals by a random initialization strategy. In all initialization strategies, every operation is assigned to a machine tool that is able to process it, which guarantees the feasibility of the solutions.
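The one-third split can be sketched as follows. The three strategy callables are placeholders (assumptions for illustration); each would return one feasible individual in a full implementation.

```python
def init_population(pop_size, strategy1, strategy2, random_strategy):
    """Build a population: 1/3 from strategy 1, 1/3 from strategy 2,
    and the remainder from the random strategy."""
    third = pop_size // 3
    population = [strategy1() for _ in range(third)]
    population += [strategy2() for _ in range(third)]
    population += [random_strategy() for _ in range(pop_size - 2 * third)]
    return population

# Placeholder strategies that only label their origin.
population = init_population(10, lambda: "s1", lambda: "s2", lambda: "rand")
print(population)
```

When the population size is not divisible by three, this sketch gives the leftover individuals to the random strategy, matching the statement that the remaining individuals are random.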

Initialization strategy 1 reduces the time and power consumed by handling, machine tool setup, and idling. According to Property 1, the next operation in the SV is taken from the most recently scheduled workpiece that is not yet completed, and its machine tool is the same as the one assigned to the preceding operation. To reduce the total power consumption, lower processing speeds are preferred, so all processing speed levels in the PSV are randomly selected from the lower speed levels.

Initialization strategy 2 is identical to initialization strategy 1 except for how the PSV is determined. To reduce the maximum completion time, all processing speed levels in the PSV of initialization strategy 2 are randomly selected from the higher speed levels.

3. Distribution estimation algorithm

Corresponding to the SV, MAV, and PSV in the encoding, the probability matrices for the FJSP in the present disclosure are $A_{mi}$, $B_{ijk}$, and $C_{ijs}$. The initialization, updating, and solution generation of these probability matrices are described below.

3.1 initialization and updating of probability matrices

$A_{mi}$ is the probability that an operation of workpiece $i$ is placed at the $m$-th position of the SV. $A_{mi}$ is initialized according to equation (47). The update rule of $A_{mi}$ at each iteration is given by equation (48), where $I_{mi}$ is defined by equation (49), SI is the number of elite solutions, and the update rate α is 0.5. In this study, SI is half the population size.

$A_{mi}(Gen = 0) = 1/I$ (47)

$B_{ijk}$ is the probability that $O_{i,j}$ is processed by machine tool $k$. $B_{ijk}$ is initialized according to equation (50). The update rule of $B_{ijk}$ at each iteration is given by equation (51), where $I_{ijk}$ is defined by equation (52) and the update rate β is 0.5.

$C_{ijs}$ is the probability that $O_{i,j}$ is processed at the $s$-th speed level. $C_{ijs}$ is initialized according to equation (53). The update rule of $C_{ijs}$ at each iteration is given by equation (54), where $I_{ijs}$ is defined by equation (55) and the update rate γ is 0.5.

$C_{ijs}(Gen = 0) = 1/S$ (53)
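The update equations (48) to (52), (54), and (55) are not reproduced in this text. The sketch below therefore assumes a common incremental EDA rule, $A \leftarrow (1-\alpha)A + \alpha \cdot \frac{1}{SI}\sum I$, where the indicator counts how often workpiece $i$ appears at position $m$ among the elite solutions. This is an assumption for illustration, not necessarily the rule of the disclosure, and the function names are hypothetical.

```python
def init_position_matrix(num_positions, num_workpieces):
    """Uniform initialization A_mi(Gen=0) = 1/I, with I = num_workpieces."""
    return [[1.0 / num_workpieces] * num_workpieces for _ in range(num_positions)]

def update_position_matrix(A, elite_svs, alpha=0.5):
    """Assumed incremental update toward elite frequencies.

    elite_svs: list of scheduling vectors using 0-based workpiece indices.
    """
    si = len(elite_svs)
    num_workpieces = len(A[0])
    updated = []
    for m, row in enumerate(A):
        freq = [0.0] * num_workpieces
        for sv in elite_svs:
            freq[sv[m]] += 1.0 / si  # indicator averaged over the elite set
        updated.append([(1 - alpha) * p + alpha * f for p, f in zip(row, freq)])
    return updated

A0 = init_position_matrix(4, 2)
A1 = update_position_matrix(A0, [[0, 1, 0, 1], [0, 0, 1, 1]])
print(A1[0])
```

Under this assumed rule every row remains a probability distribution, because both the old row and the elite frequencies sum to one.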

3.2 Generation of solutions

The SV, MAV, and PSV of a solution are generated in sequence from the probability matrices $A_{mi}$, $B_{ijk}$, and $C_{ijs}$. For $A_{mi}$, once all operations of a workpiece have been scheduled, the probability of selecting that workpiece at subsequent positions is 0; for $B_{ijk}$, if machine tool $k$ cannot process $O_{i,j}$, then $B_{ijk}$ is always 0; for $C_{ijs}$, the speed level is always drawn from the probability distribution.
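A minimal sketch of sampling an SV from $A_{mi}$ with the masking rule for finished workpieces is given below. The uniform fallback for an all-zero row and the name `sample_sv` are assumptions added for robustness, not part of the source.

```python
import random

def sample_sv(A, ops_per_workpiece, rng=random):
    """Sample a scheduling vector position by position from matrix A.

    A[m][i] is the probability of workpiece i at position m (0-based).
    Workpieces with no remaining operations get probability 0.
    """
    remaining = list(ops_per_workpiece)
    sv = []
    for m in range(sum(ops_per_workpiece)):
        weights = [A[m][i] if remaining[i] > 0 else 0.0
                   for i in range(len(remaining))]
        if sum(weights) == 0.0:
            # Assumed fallback: uniform over unfinished workpieces.
            weights = [1.0 if r > 0 else 0.0 for r in remaining]
        workpiece = rng.choices(range(len(remaining)), weights=weights)[0]
        sv.append(workpiece)
        remaining[workpiece] -= 1
    return sv

A = [[0.5, 0.5] for _ in range(4)]
sv = sample_sv(A, [2, 2], random.Random(0))
print(sv)
```

The MAV and PSV can be sampled the same way from $B_{ijk}$ and $C_{ijs}$, with infeasible machine tools carrying zero probability.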

4. Deep reinforcement learning

4.1 State features

State features describe the scheduling state at a decision point. They should therefore reflect the effect of the actions taken and help the model select the next action.

The DQN part of the present disclosure uses 34 state features, divided into three groups. The first group contains state features of the problem scale, including the number of workpieces, the number of machine tools, the number of operations of all workpieces, and the standard deviation $OP_{std}$ of the number of operations assigned to each machine tool. The second group contains state features associated with the objective values, including the maximum completion time, the mean and standard deviation of the completion times of the machine tools, the total electricity cost, and the mean and standard deviation of the electricity costs of the machine tools. The third group contains state features associated with the schedule elements, including the time and total electricity cost of machine tool processing, idling, setup, and handling, as well as the mean and standard deviation of the corresponding variables. All state features are shown in Fig. 3.

4.2. Action

In EDA-DQN, 9 actions are designed based on the properties of the problem, so that the actions effectively improve the quality of the resulting solutions. To enhance the exploration capability of the DQN, three of the actions generate multiple solutions.

The first action $a_1$ is designed according to Property 2 and Property 3 and can optimize the maximum completion time and the total electricity cost simultaneously. Algorithm 1 describes the procedure of $a_1$.

According to Property 4, the second action $a_2$ changes the processing order of operations on the same machine tool along the critical path, reducing machine tool idle time and power consumption. Algorithm 2 describes the procedure of $a_2$.

The third action $a_3$ is designed according to Property 4 and Property 5 and changes the machine tool assignment of operations on the critical path. Algorithm 3 describes the procedure of $a_3$.

Actions $a_4$, $a_5$, and $a_6$ are classical mutation operators. $a_4$ exchanges two operations from different workpieces in the SV, with the machine tool allocation and speed settings unchanged. Similar to $a_4$, $a_5$ randomly selects two pairs of operations to exchange. $a_6$ randomly selects two operations and moves the later one to the position before the earlier one, with the machine tool allocation and speed settings unchanged.

To enhance the exploration of the model, actions $a_7$, $a_8$, and $a_9$ are random strategies. $a_7$ adjusts the processing order of a random sequence of operations in the SV, with the machine tool allocation and speed settings unchanged. $a_8$ changes only the machine tool assignment of one random operation, and the newly assigned machine tool must be able to process that operation. $a_9$ changes only the processing speed of one random operation.
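The classical operators can be sketched on the vectors alone. The functions below are hypothetical illustrations in the style of $a_4$, $a_6$, and $a_9$; how the MAV and PSV entries follow a moved operation is not specified in the text and is left out here.

```python
import random

def swap_operations(sv, rng=random):
    """a4-style: exchange two SV positions that hold different workpieces."""
    sv = list(sv)
    if len(set(sv)) < 2:
        return sv  # nothing to exchange
    while True:
        p, q = rng.sample(range(len(sv)), 2)
        if sv[p] != sv[q]:
            sv[p], sv[q] = sv[q], sv[p]
            return sv

def insert_operation(sv, rng=random):
    """a6-style: move the later of two random positions before the earlier."""
    sv = list(sv)
    p, q = sorted(rng.sample(range(len(sv)), 2))
    sv.insert(p, sv.pop(q))
    return sv

def change_speed(psv, num_levels, rng=random):
    """a9-style: change the speed level of one random operation."""
    psv = list(psv)
    pos = rng.randrange(len(psv))
    options = [s for s in range(1, num_levels + 1) if s != psv[pos]]
    if options:
        psv[pos] = rng.choice(options)
    return psv

rng = random.Random(1)
sv = [0, 1, 0, 2, 1, 2]
print(swap_operations(sv, rng), insert_operation(sv, rng),
      change_speed([1, 2, 3], 3, rng))
```

Because the SV only stores workpiece numbers, both SV operators keep the number of operations per workpiece intact, so the result always decodes to a valid processing order.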

4.3. Reward

At decision point t, the reward of the DQN part is defined as in Algorithm 4.

5. DQN structure and training procedure

In the DQN, the numbers of input and output nodes equal the number of state features and the number of actions, respectively. A fully connected neural network with 2 hidden layers is used, with 20 hidden-layer nodes. The activation function from the input layer to the hidden layer is the ReLU function, and the activation function of the output layer is the purelin (linear) function. The training process of the DQN is described in Algorithm 5.
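The network shape described above can be sketched in plain Python as a 34-20-20-9 multilayer perceptron. The sketch assumes 20 nodes in each hidden layer and ReLU on both hidden layers; the weight initialization range is arbitrary, and training (Algorithm 5) is not shown.

```python
import random

def make_layer(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, relu):
    """Apply one dense layer, with ReLU or a linear (purelin) output."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out

def q_values(state, layers):
    """Forward pass: ReLU on hidden layers, linear on the output layer."""
    h = state
    for k, layer in enumerate(layers):
        h = dense(h, layer, relu=(k < len(layers) - 1))
    return h

rng = random.Random(0)
sizes = [34, 20, 20, 9]  # 34 state features, two hidden layers, 9 actions
online = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
# The target network starts as a copy of the online network.
target = [([row[:] for row in w], b[:]) for w, b in online]

state = [rng.random() for _ in range(34)]
q = q_values(state, online)
best_action = max(range(len(q)), key=q.__getitem__)
print(best_action)
```

The output layer has one node per action, so the index of the largest output identifies the local search strategy with the highest estimated Q value.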

6. Non-dominated sorting and crowding-distance sorting

For two solutions x and y, x dominates y if $(C_{\max,x} \le C_{\max,y}) \land (TEC_x < TEC_y)$ or $(C_{\max,x} < C_{\max,y}) \land (TEC_x \le TEC_y)$; otherwise, x and y do not dominate each other.
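The dominance test translates directly into code. The sketch below represents each solution as a hypothetical `(C_max, TEC)` tuple and extracts the first non-dominated front; crowding-distance sorting is not shown.

```python
def dominates(x, y):
    """Return True if x dominates y; both objectives are minimized.

    x and y are (C_max, TEC) tuples.
    """
    return (x[0] <= y[0] and x[1] < y[1]) or (x[0] < y[0] and x[1] <= y[1])

def non_dominated(solutions):
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]

# Illustrative objective values only.
solutions = [(10, 5), (8, 7), (9, 9), (12, 4)]
print(non_dominated(solutions))
```

Here `(9, 9)` is removed because `(8, 7)` is better in both objectives, while the other three trade completion time against electricity cost.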

Example 2:

embodiment 2 of the present disclosure provides a multi-objective optimized machine tool flexible shop scheduling system, comprising:

an initialization module configured to: initializing a target network and an online network;

a network training module configured to: performing depth Q network DQN training by using an excitation function;

an initialization policy module configured to: initializing a population by an initialization strategy;

a ranking module configured to: non-dominant ranking and congestion degree ranking.

The working method of the system is the same as the multi-objective optimized machine tool flexible job shop scheduling method provided in Embodiment 1, and is not repeated here.

Example 3:

embodiment 3 of the present disclosure provides a medium having stored thereon a program which, when executed by a processor, implements steps in a multi-objective optimized machine tool flexible shop scheduling method as described in embodiment 1 of the present disclosure.

The detailed steps are the same as those of the multi-objective optimized machine tool flexible job shop scheduling method provided in Embodiment 1, and are not repeated here.

Example 4:

embodiment 4 of the present disclosure provides an apparatus including a memory, a processor, and a program stored on the memory and executable on the processor, which when executed implements steps in a multi-objective optimized machine tool flexible shop scheduling method according to embodiment 1 of the present disclosure.

The detailed steps are the same as those of the multi-objective optimized machine tool flexible job shop scheduling method provided in Embodiment 1, and are not repeated here.

Example 5: Experimental analysis

5.1. Test instances

The test instances in the present disclosure are randomly generated and divided into three groups. The first group contains 30 instances, with the number of workpieces in {10,20,30,40,50,60,80,100,150,200} and the number of machine tools in {5,8,10}. The second group contains 12 instances, with the number of workpieces in {4,6,8,10} and the number of machine tools in {2,3,5}. The third group is a subset of the first group, comprising the 10 instances in which the number of machine tools is 8. The instance name "10J5M" denotes an instance with 10 workpieces and 5 machine tools.
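The group sizes follow from combining the workpiece and machine tool counts, as the short sketch below confirms. The variable names are hypothetical; the naming pattern follows the "10J5M" convention given above.

```python
from itertools import product

# Group 1: every combination of workpiece count and machine tool count.
group1 = [f"{j}J{m}M" for j, m in
          product([10, 20, 30, 40, 50, 60, 80, 100, 150, 200], [5, 8, 10])]
# Group 2: small-scale instances.
group2 = [f"{j}J{m}M" for j, m in product([4, 6, 8, 10], [2, 3, 5])]
# Group 3: the instances of group 1 with 8 machine tools.
group3 = [name for name in group1 if name.endswith("J8M")]

print(len(group1), len(group2), len(group3))
```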

5.2. Evaluation criteria

The present disclosure uses the hypervolume (HV) and the inverted generational distance (IGD) to compare the performance of the proposed algorithm with that of other algorithms.

In the calculation of HV, the two objectives are normalized according to equation (56), where $f_i'$ is the $i$-th normalized objective value, $f_i$ is the $i$-th objective value, and $f_{i,\max}$ is the maximum value of the $i$-th objective.

$f_i' = f_i / f_{i,\max}$ (56)

The reference point for the HV calculation is set to (3, 3), and the HV value is calculated as follows.

Here λ(·) is the Lebesgue measure, NS is the non-dominated solution set, and $v_i$ is the hypervolume between the non-dominated solution $\pi_i$ and the reference point.

The IGD calculation formula is as follows.

Here PF is the true Pareto front, and $|j-i|$ is the Euclidean distance between the $j$-th solution on PF and the $i$-th solution in NS.

The relative percentage increment (RPI) is used to compare the metric values of different algorithms, and the calculation formula is as follows.

In the above formula, $f_c$ is the average metric value obtained by the algorithm being evaluated, and $f_{min}$ is the best average metric value among all algorithms.
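The HV, IGD, and RPI formulas themselves are not reproduced in this text. The sketch below uses the standard definitions that match the symbols described: HV as the area dominated by NS up to the reference point, IGD as the mean distance from each PF point to its nearest NS point, and RPI as $(f_c - f_{min})/f_{min} \times 100$. These forms and the function names are assumptions; the exact RPI form in particular may differ.

```python
import math

def hypervolume_2d(front, ref=(3.0, 3.0)):
    """Area dominated by a bi-objective minimization front up to `ref`."""
    points = sorted(p for p in front if p[0] <= ref[0] and p[1] <= ref[1])
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in points:
        if f2 < best_f2:  # only points that improve f2 add new area
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv

def igd(pf, ns):
    """Mean Euclidean distance from each PF point to its nearest NS point."""
    return sum(min(math.dist(j, i) for i in ns) for j in pf) / len(pf)

def rpi(f_c, f_min):
    """Assumed relative percentage increment of f_c over the best value."""
    return (f_c - f_min) / f_min * 100.0

# Illustrative normalized objective values only.
ns = [(0.5, 1.0), (1.0, 0.5)]
print(hypervolume_2d(ns), igd([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0)]), rpi(1.1, 1.0))
```

A larger HV and a smaller IGD both indicate a front that is closer to, and spread more widely along, the true Pareto front.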

5.3. Parameter calibration

The EDA-DQN model is most sensitive to the parameter μ of the softmax strategy used for action selection, so the present disclosure calibrates only μ. A larger μ favors actions with larger Q values but tends to fall into local optima; a smaller μ favors random action selection but loses knowledge-based strategy information. Fig. 4 shows the trends of HV and IGD for μ from 1.1 to 2.0; both HV and IGD achieve their best values when μ is 1.5. Therefore, μ is set to 1.5 in the following experiments.
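The exact softmax form is not given in the text. One form consistent with the described behavior and with μ ranging from 1.1 to 2.0 uses μ as the exponential base, $P(a) \propto \mu^{Q(a)}$; the sketch below assumes this form, and the function names are hypothetical.

```python
import random

def softmax_probs(q, mu=1.5):
    """Assumed selection probabilities P(a) proportional to mu ** Q(a)."""
    q_max = max(q)  # subtract the maximum for numerical stability
    weights = [mu ** (v - q_max) for v in q]
    total = sum(weights)
    return [w / total for w in weights]

def select_action(q, mu=1.5, rng=random):
    """Sample an action index according to the softmax probabilities."""
    return rng.choices(range(len(q)), weights=softmax_probs(q, mu))[0]

q = [1.0, 2.0, 3.0]
print(softmax_probs(q, 1.5), softmax_probs(q, 2.0))
print(select_action(q, 1.5, random.Random(0)))
```

Under this assumed form, raising μ concentrates probability on the action with the largest Q value, while μ close to 1 approaches uniform random selection.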

5.4. Initialization strategy comparison experiment

To verify the effectiveness of the initialization strategies in the present disclosure, the algorithm with the proposed initialization strategies is compared with the algorithm using only random initialization (R-init). The analysis of variance of the RPI values for HV and IGD is shown in Fig. 5; EDA-DQN with the proposed initialization strategies outperforms random initialization.

5.5. Comparison experiment on exploration and exploitation capabilities

To compare the exploration and exploitation capabilities of EDA-DQN, three algorithms are compared: an algorithm without the EDA part (NEDA-DQN), an algorithm without the DQN part (EDA-NDQN), and the algorithm containing all parts (EDA-DQN). Because the three algorithms differ in complexity, all of them are run for 1000 iterations to ensure fairness.

The analysis of variance of the RPI values for HV and IGD is shown in Fig. 6. The proposed EDA-DQN performs better than either the EDA part or the DQN part alone, indicating that EDA-DQN combines the exploration capability of the EDA part with the exploitation capability of the DQN part.

5.6. Action selection strategy comparison experiment

To demonstrate the effect of DQN-based action selection, EDA-DQN is compared with the same algorithm using a random action selection strategy (EDA-RDQN). Fig. 7 shows the analysis of variance of the RPI values for HV and IGD; it can be seen that the DQN part can select appropriate actions at different scheduling points.

5.7. Algorithm performance comparison test

To verify the performance of EDA-DQN, three competitive algorithms, HMOEA/D [4], TPM [49] and BEG-NSGA-II [50], were chosen for comparison. Fig. 8 shows the analysis of variance of the RPI values for HV and IGD; it can be seen that EDA-DQN has a clear advantage over the other algorithms in solving the FJSP problem of the present disclosure.

5.8. Comparison experiment with the CPLEX solver

To verify the correctness of the proposed MILP model, the performance of EDA-DQN is compared with that of the CPLEX solver. In the CPLEX optimization, the two objectives C_max and TEC are combined into a single weighted objective in which C_max is weighted by w. The weight w takes values in {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}, so nine optimal solutions are obtained for each instance. The solution time limit for each instance in the CPLEX optimizer is 1 h. Table 1 lists the results of the CPLEX solver and of EDA-DQN; for instance 10J5M, CPLEX fails to obtain a feasible solution within the given time. The results in Table 1 show that EDA-DQN finds better feasible solutions in terms of both HV and IGD.

Table 1 CPLEX comparative experiment results

The experiments show that the proposed knowledge-based hybrid EDA-DQN algorithm can solve the constrained FJSP and optimize the maximum finishing time and the total electricity charge at the same time. The five knowledge-based FJSP properties enhance the exploration and exploitation capabilities of the DQN part. The effectiveness of the initialization strategy and of the MILP model is demonstrated through experiments. The comparison results show that both the EDA part and the DQN part improve the performance of EDA-DQN, and that EDA-DQN obtains better results than other efficient algorithms.

It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

Those skilled in the art will appreciate that all or part of the above-described methods of the embodiments may be implemented by a computer program stored on a computer-readable storage medium, which, when executed, may comprise the steps of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.

The foregoing is merely a description of the preferred embodiments of the present disclosure and is not intended to limit the disclosure; various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.

Claims (7)

1. A multi-objective optimized machine tool flexible shop scheduling method, characterized by comprising:

acquiring workshop machine tool constraints, including machine tool processing speed, machine tool preparation time, idle time, time-of-use electricity price and carrying time;

optimizing the maximum finishing time and the total electricity charge by using an EDA-DQN model according to the acquired workshop machine tool constraint;

obtaining an optimization result;

the EDA-DQN model optimizes the maximum finishing time and the total electricity charge, and specifically comprises the following steps:

initializing a target network and an online network Q;

performing deep Q-network (DQN) training by using activation functions;

initializing a population by an initialization strategy; the population initialization comprises the steps of designing a first initialization strategy, a second initialization strategy and a random initialization strategy according to a scheduling vector, a machine tool allocation vector and a processing speed vector; one third of population individuals are obtained according to a first initialization strategy, one third of population individuals are obtained according to a second initialization strategy, and one third of population individuals are obtained according to a random initialization strategy;

performing non-dominated sorting and crowding-degree sorting to obtain better solutions;

the DQN section uses 34 state features that are divided into three groups;

the first group comprises state features related to the problem scale, taking into account the number of workpieces, the number of machine tools, the number of operations of all workpieces, and the standard deviation OP_std of the operations allocated to all machine tools; the second group comprises state features related to the target values, taking into account the maximum finishing time, the mean and standard deviation of the finishing time of each machine tool, the total electricity charge, and the mean and standard deviation of the electricity charge on each machine tool; the third group comprises state features related to the scheduling elements, taking into account the time and total electricity charge of the machine tool processing state, idle state, preparation state and carrying process, and the mean and standard deviation of the corresponding variables;

in EDA-DQN, 9 actions are designed based on the nature of the problem;

to enhance the exploration ability of DQN, three actions are included to generate multiple solutions;

the first action a₁ can optimize the maximum finishing time and the total electricity charge simultaneously;

the second action a₂ changes the processing order of operations on the same machine tool on the critical path, reducing the idle time and the power consumption of the machine tool;

the third action a₃ can change the machine tool assignment of operations on the critical path;

actions a₄, a₅ and a₆ act as classical mutation operators; a₄ exchanges two operations from different workpieces on the scheduling vector, keeping the machine tool allocation and the speed setting unchanged; similar to a₄, a₅ randomly selects two pairs of operations for exchange; a₆ randomly selects two operations and moves the later-processed operation to the position before the earlier one, keeping the machine tool allocation and the speed setting unchanged;

in order to enhance the exploration capability of the model, actions a₇, a₈ and a₉ adopt random strategies; a₇ adjusts the processing order of one random operation on the scheduling vector, keeping the machine tool allocation and the speed setting unchanged; a₈ changes only the machine tool allocation of one random operation, the newly allocated machine tool being able to process that operation; a₉ changes only the processing speed of one random operation.
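By way of illustration only, the mutation-type actions can be sketched in Python. The sketch assumes that the scheduling vector is a list of workpiece identifiers, one entry per operation, and that the machine tool allocation vector holds one machine index per operation; these encodings, the function names and the sample data are hypothetical, and only actions resembling a₄, a₆ and a₈ are shown.

```python
import random

def swap_ops(schedule, rng):
    """a4-like: swap two entries of the scheduling vector that belong to different workpieces."""
    s = list(schedule)
    pairs = [(p, q) for p in range(len(s)) for q in range(p + 1, len(s)) if s[p] != s[q]]
    if not pairs:
        return s  # all entries belong to one workpiece; nothing to swap
    p, q = rng.choice(pairs)
    s[p], s[q] = s[q], s[p]
    return s

def insert_before(schedule, rng):
    """a6-like: pick two positions and move the later entry to just before the earlier one."""
    s = list(schedule)
    if len(s) < 2:
        return s
    p, q = sorted(rng.sample(range(len(s)), 2))
    s.insert(p, s.pop(q))
    return s

def reassign_machine(machines, eligible, rng):
    """a8-like: give one random operation another machine tool that is able to process it."""
    m = list(machines)
    k = rng.randrange(len(m))
    m[k] = rng.choice(eligible[k])
    return m

rng = random.Random(0)
schedule = [1, 2, 1, 3, 2, 3]           # hypothetical workpiece ids, one per operation
machines = [0, 1, 0, 2, 1, 2]           # hypothetical machine index per operation
eligible = [[0, 1], [1, 2], [0], [1, 2], [0, 1], [2]]
print(swap_ops(schedule, rng))
print(insert_before(schedule, rng))
print(reassign_machine(machines, eligible, rng))
```

In this sketch each function returns a new vector and leaves the other vectors untouched, mirroring the requirement that the machine tool allocation and the speed setting stay unchanged when only the scheduling vector is modified.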

2. The multi-objective optimized machine tool flexible shop scheduling method according to claim 1, wherein the numbers of input and output nodes in the DQN correspond to the number of state features and the number of actions, respectively.

3. The multi-objective optimized machine tool flexible shop scheduling method according to claim 2, wherein the DQN training specifically comprises: adopting a fully-connected neural network with 2 hidden layers, wherein the number of hidden layer nodes is 20; the activation function from the input layer to the hidden layer is a ReLU function, and the activation function of the output layer is a purelin (linear) function.
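By way of illustration only, the network structure can be sketched in plain Python as a forward pass with 34 inputs (state features) and 9 outputs (actions). The sketch assumes 20 nodes in each of the two hidden layers and a ReLU between the hidden layers as well; the weight initialization, the function names and the sample state are hypothetical, and training is not shown.

```python
import random

N_STATE, N_HIDDEN, N_ACTIONS = 34, 20, 9

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer):
    """Fully-connected layer: one weighted sum plus bias per output node."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def q_values(state, layers):
    """34 state features -> 20 -> 20 -> 9 Q values, with a linear output layer."""
    h = relu(dense(state, layers[0]))
    h = relu(dense(h, layers[1]))
    return dense(h, layers[2])  # linear (purelin-like) output, no activation

rng = random.Random(0)
layers = [
    init_layer(N_STATE, N_HIDDEN, rng),
    init_layer(N_HIDDEN, N_HIDDEN, rng),
    init_layer(N_HIDDEN, N_ACTIONS, rng),
]
state = [0.5] * N_STATE  # hypothetical normalized state features
q = q_values(state, layers)
print(len(q))  # 9
```

Each of the 9 outputs is the estimated Q value of one action, so the action selection strategy can operate directly on the returned list.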

4. A multi-objective optimized machine tool flexible shop scheduling method according to claim 3, wherein the DQN training further comprises:

updating the parameter θ of the online network Q by gradient descent, to obtain the target network.

5. The multi-objective optimized machine tool flexible shop scheduling method according to claim 4, wherein, in the non-dominated sorting and crowding-degree sorting, for two solutions x and y, if (C_max,x ≤ C_max,y) ∧ (TEC_x < TEC_y) or (C_max,x < C_max,y) ∧ (TEC_x ≤ TEC_y), then x dominates y;

otherwise, x and y do not dominate each other.
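By way of illustration only, the dominance condition can be sketched in Python, with each solution represented as a pair (C_max, TEC). The function names `dominates` and `non_dominated` and the sample values are hypothetical; the filter shown extracts only the first non-dominated front and does not perform the crowding-degree sorting.

```python
def dominates(x, y):
    """True if x dominates y, where x and y are (C_max, TEC) pairs to be minimized."""
    c_x, tec_x = x
    c_y, tec_y = y
    return (c_x <= c_y and tec_x < tec_y) or (c_x < c_y and tec_x <= tec_y)

def non_dominated(solutions):
    """Keep the solutions that are not dominated by any other solution."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]

# Hypothetical (C_max, TEC) values.
solutions = [(10, 5), (10, 6), (9, 6), (12, 4)]
print(non_dominated(solutions))  # [(10, 5), (9, 6), (12, 4)]
```

Here (10, 6) is removed because (10, 5) is no worse in C_max and strictly better in TEC, while the remaining three solutions trade one objective against the other.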

6. A medium having stored thereon a program, which when executed by a processor, implements the steps of the multi-objective optimized machine tool flexible shop scheduling method according to any one of claims 1-5.

7. An apparatus comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor performs the steps in the multi-objective optimized machine tool flexible shop scheduling method according to any one of claims 1-5 when the program is executed.

Priority Applications (1)

Application Number	Priority Date	Filing Date	Title
CN202110986700.2A CN113759841B (en)	2021-08-26	2021-08-26	Multi-objective optimized machine tool flexible workshop scheduling method and system

Applications Claiming Priority (1)

Application Number	Priority Date	Filing Date	Title
CN202110986700.2A CN113759841B (en)	2021-08-26	2021-08-26	Multi-objective optimized machine tool flexible workshop scheduling method and system

Publications (2)

Publication Number	Publication Date
CN113759841A (en)	2021-12-07
CN113759841B (en)	2024-01-12

Family

ID=78791348

Family Applications (1)

Application Number	Priority Date	Filing Date	Title
CN202110986700.2A Active CN113759841B (en)	2021-08-26	2021-08-26	Multi-objective optimized machine tool flexible workshop scheduling method and system