CN113759841B - Multi-objective optimized machine tool flexible workshop scheduling method and system - Google Patents
- ️Fri Jan 12 2024
Info
- Publication number: CN113759841B (application CN202110986700.2A)
- Authority: CN (China)
- Prior art keywords: machine tool, dqn, eda, scheduling method, time
- Prior art date: 2021-08-26
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/418—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
- G05B19/41885—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM] characterised by modeling, simulation of the manufacturing system
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/32—Operator till task planning
- G05B2219/32339—Object oriented modeling, design, analysis, implementation, simulation language
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Landscapes
- Engineering & Computer Science (AREA)
- Manufacturing & Machinery (AREA)
- General Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- General Factory Administration (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present disclosure provides a multi-objective optimized machine tool flexible workshop scheduling method, including: initializing a target network $\hat{Q}$ and an online network Q; performing deep Q-network (DQN) training using activation functions; initializing a population through initialization strategies; and performing non-dominated sorting and crowding-distance sorting to obtain better solutions. The present disclosure exploits the advantages of evolutionary algorithms (EAs) and deep reinforcement learning (DRL), proposing a hybrid algorithm, EDA-DQN, for the flexible job shop scheduling problem (FJSP) that combines a knowledge-based estimation of distribution algorithm (EDA) with DQN. The maximum completion time (makespan) and the total electricity cost are optimized using a Pareto-based method. In EDA-DQN, the DQN part is a local search selector responsible for selecting an appropriate local search strategy in each scheduling state; the EDA part improves the exploration capability of the algorithm, and the DQN part improves its exploitation capability.
Description
Technical Field
The disclosure relates to the technical field of workshop scheduling, in particular to a multi-objective optimized machine tool flexible workshop scheduling method and system.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
The flexible job shop scheduling problem (FJSP) has been widely studied and applied in fields such as semiconductor manufacturing, chemical processing and mobile phone assembly. In practical production, the influence of time-of-use electricity prices (TOUEP) should be considered for both economic and green development, so electricity should preferably be consumed during low-price periods. Furthermore, the processing speed settings of the machine tools should balance the production plan against the profit of the enterprise: operations with high power consumption should be scheduled in low-price periods, while operations with low power consumption may be scheduled in high-price periods.
The prior art solves the FJSP under various constraints, but its performance degrades sharply on large-scale problems. To address this, reinforcement learning (RL), which is currently regarded as having strong intelligent optimization capability, can be used to enhance the optimization capability for the FJSP. Recently, deep reinforcement learning (DRL), which combines RL with deep learning, has been increasingly applied to combinatorial optimization problems (COPs), demonstrating its strong ability to solve them.
Several studies have used RL to optimize the parameters of evolutionary algorithms (EAs). Emary et al. optimized the parameters of the grey wolf optimizer (GWO) using RL and neural networks. Cao et al. proposed a cuckoo search algorithm with RL and surrogate models to solve the scheduling problem of semiconductor final testing, where RL is used to maintain population diversity and search intensity. Although RL has been applied to enhance optimization by adjusting the parameter values of EAs, the intelligence of RL is not directly involved in the optimization process.
Recently, more research has used RL directly as the optimization strategy. Lin et al. used a multi-stage DQN to solve the job shop scheduling problem, optimizing the maximum completion time. Park et al. proposed an RL-based setup-change scheduling method to minimize the maximum completion time in chip production. Luo designed a DQN algorithm for the dynamic FJSP, providing a basic algorithm for solving the FJSP with RL. Hu et al. solved a flexible shop production problem with automated guided vehicle scheduling based on mixed rules, optimizing the maximum completion time and the delay rate. He et al. solved the agile satellite scheduling problem using DQN models. Park et al. combined graph neural networks with RL to solve the job shop scheduling problem, where RL-based proximal policy optimization is used to train the model. Zhao et al. proposed an RL-based cooperative water wave optimization algorithm to solve the distributed assembly no-wait flow shop scheduling problem. Han and Yang constructed an end-to-end DRL framework to solve the FJSP, in which an improved pointer network encodes the operations and a convolutional neural network serves as the decoding network. Kim and Lee used Petri nets as the RL environment to solve the flow shop scheduling problem, optimizing the maximum completion time. Xu et al. designed an RL-based differential evolution algorithm to solve the multi-stage energy scheduling problem of an industrial integrated energy system. The above studies solve COPs with various RL methods and exploit the intelligence of RL; however, most of them optimize only a single objective, and RL methods for multi-objective problems need further investigation. In addition, the advantages of EAs and RL need to be combined to further improve the optimization performance of the algorithm.
Disclosure of Invention
To overcome the shortcomings of the prior art, the present disclosure provides a multi-objective optimized machine tool flexible workshop scheduling method and system. Exploiting the advantages of EAs and DRL, it proposes a hybrid algorithm for the FJSP, EDA-DQN, that combines a knowledge-based estimation of distribution algorithm (EDA) with DQN. The maximum completion time (makespan) and the total electricity cost are optimized using a Pareto-based method. In EDA-DQN, the DQN part is a local search selector responsible for selecting an appropriate local search strategy in each scheduling state; the EDA part improves the exploration capability of the algorithm, and the DQN part improves its exploitation capability.
In order to achieve the above purpose, the present disclosure adopts the following technical scheme:
the first aspect of the present disclosure provides a multi-objective optimized machine tool flexible shop scheduling method.
A multi-objective optimized machine tool flexible workshop scheduling method comprises the following steps:
acquiring workshop machine tool constraints, including machine tool processing speed, machine tool preparation time, idle time, time-of-use electricity price and carrying time;
optimizing the maximum finishing time and the total electricity charge by using an EDA-DQN model;
obtaining an optimization result;
the EDA-DQN model optimizes the maximum finishing time and the total electricity charge, and specifically comprises the following steps:
initializing a target network $\hat{Q}$ and an online network Q;
performing deep Q-network (DQN) training using activation functions;
initializing a population through initialization strategies;
and performing non-dominated sorting and crowding-distance sorting to obtain better solutions.
Further, the population initialization includes designing a first initialization strategy, a second initialization strategy, and a random initialization strategy based on a Scheduling Vector (SV), a machine tool allocation vector (MAV), and a Process Speed Vector (PSV).
Further, when the population is initialized, one third of the individuals are obtained according to the first initialization strategy, one third according to the second initialization strategy, and one third according to the random initialization strategy.
Further, in the DQN, the numbers of input and output nodes equal the number of state features and the number of actions, respectively.
Further, the DQN training specifically includes: adopting a fully connected neural network with 2 hidden layers, each with 20 nodes; the activation function from the input layer to the hidden layers is the ReLU function, and the activation function of the output layer is the purelin (linear) function.
Further, the DQN training further comprises:
updating the parameter θ of the online network Q by gradient descent according to the loss function, to obtain the target network $\hat{Q}$.
Further, in the non-dominated sorting and crowding-distance sorting, for two solutions x and y, x dominates y if
$(C_{\max,x} \le C_{\max,y}) \land (TEC_x < TEC_y)$
or
$(C_{\max,x} < C_{\max,y}) \land (TEC_x \le TEC_y)$;
otherwise, x does not dominate y. If neither solution dominates the other, x and y are mutually non-dominated.
A second aspect of the present disclosure provides a multi-objective optimized machine tool flexible shop scheduling system.
A multi-objective optimized machine tool flexible shop scheduling system, comprising:
an initialization module configured to: initialize a target network and an online network;
a network training module configured to: perform deep Q-network (DQN) training using activation functions;
an initialization policy module configured to: initialize a population through initialization strategies;
a ranking module configured to: perform non-dominated sorting and crowding-distance sorting.
A third aspect of the present disclosure provides a medium having stored thereon a program which when executed by a processor implements the steps in a multi-objective optimized machine tool flexible shop scheduling method according to the first aspect of the present disclosure.
A fourth aspect of the present disclosure provides an apparatus comprising a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing steps in a multi-objective optimized machine tool flexible shop scheduling method according to the first aspect of the present disclosure when the program is executed.
Compared with the prior art, the beneficial effects of the present disclosure are:
The present disclosure exploits the advantages of EAs and DRL, proposing a hybrid algorithm for the FJSP, EDA-DQN, that combines a knowledge-based estimation of distribution algorithm (EDA) with DQN. The maximum completion time and the total electricity cost are optimized using a Pareto-based method. In EDA-DQN, the DQN part is a local search selector responsible for selecting an appropriate local search strategy in each scheduling state; the EDA part improves the exploration capability of the algorithm, and the DQN part improves its exploitation capability.
The FJSP considered in the present disclosure includes constraints such as variable processing speed, setup time, idle time, workpiece handling and time-of-use electricity prices. The maximum completion time and the total electricity cost are optimized by the hybrid algorithm of EDA and DQN. In the DQN part, 34 state features and 9 actions are designed to enhance the exploitation capability of the algorithm, while the problem-based EDA part enhances its exploration capability.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate and explain the exemplary embodiments of the disclosure and together with the description serve to explain the disclosure, and do not constitute an undue limitation on the disclosure.
Fig. 1 is a schematic flow chart of the EDA-DQN model provided in embodiment 1 of the present disclosure.
Fig. 2 is an encoding diagram of a solution provided in embodiment 1 of the present disclosure.
Fig. 3 is a diagram of the state features considered by the EDA-DQN model provided in embodiment 1 of the present disclosure.
Fig. 4 is a graph of the results of the parameter calibration experiment provided in embodiment 1 of the present disclosure.
Fig. 5 is a diagram of the comparison results of the initialization strategies provided in embodiment 1 of the present disclosure.
Fig. 6 is a diagram of the comparison results of exploration and exploitation capability provided in embodiment 1 of the present disclosure.
Fig. 7 is a diagram of the comparison results of the action selection strategies provided in embodiment 1 of the present disclosure.
Fig. 8 is a diagram of the comparison results of algorithm performance provided in embodiment 1 of the present disclosure.
Detailed Description
The disclosure is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments in accordance with the present disclosure. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict.
Example 1:
As shown in Fig. 1, embodiment 1 of the present disclosure provides a multi-objective optimized machine tool flexible shop scheduling method, including: obtaining shop machine tool constraints, including machine tool processing speed, machine tool preparation time, idle time, time-of-use electricity price and handling time;
optimizing the maximum finishing time and the total electricity charge by using an EDA-DQN model;
obtaining an optimization result;
the EDA-DQN model optimizes the maximum finishing time and the total electricity charge, and specifically comprises the following steps:
initializing a target network $\hat{Q}$ and an online network Q;
performing deep Q-network (DQN) training using activation functions;
initializing a population through initialization strategies;
and performing non-dominated sorting and crowding-distance sorting to obtain better solutions.
The method comprises the following steps:
1. Encoding and decoding
The FJSP considered in the present disclosure is encoded with a three-dimensional (three-vector) encoding strategy. First, the processing order of all operations is controlled by a scheduling vector (SV), in which each element is the workpiece number of the corresponding operation, and the j-th occurrence of a workpiece number represents the j-th operation of that workpiece. Then, the machine tool assignment of all operations is controlled by a machine tool allocation vector (MAV), in which each machine tool number corresponds one-to-one to an operation in the SV. Finally, the processing speeds of all operations are controlled by a process speed vector (PSV), in which each speed level corresponds one-to-one to a position in the MAV. Fig. 2 shows the encoding of a solution for an FJSP with 3 workpieces and 3 machine tools.
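The "j-th occurrence" rule of the SV can be illustrated with a short Python sketch. The vectors below are hypothetical values chosen for illustration (they are not the instance of Fig. 2), and `decode_sv` is an assumed helper name.

```python
from collections import defaultdict

def decode_sv(sv):
    """Map each SV entry to (workpiece, operation index) by counting occurrences."""
    seen = defaultdict(int)
    operations = []
    for workpiece in sv:
        seen[workpiece] += 1  # the j-th occurrence is the j-th operation
        operations.append((workpiece, seen[workpiece]))
    return operations

# Hypothetical 3-workpiece example with two operations per workpiece.
sv = [2, 1, 3, 1, 2, 3]    # scheduling vector: workpiece numbers
mav = [1, 3, 2, 3, 1, 2]   # machine tool of the operation at the same position
psv = [1, 2, 1, 3, 2, 1]   # speed level of the operation at the same position

# One (operation, machine tool, speed level) triple per SV position.
schedule = list(zip(decode_sv(sv), mav, psv))
```

Here the fourth SV entry is the second occurrence of workpiece 1, so it denotes operation 2 of workpiece 1, processed on machine tool 3 at speed level 3.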
In the decoding process, the operations are processed in the order given by the SV, and the start and end times of each operation are determined by the preceding operation of the same workpiece, the machine tool preparation time, the machine tool idle time and the handling time. In the objective calculation, the maximum completion time is the end time of the last completed operation, and the total electricity cost is the sum of all electricity costs incurred under the TOUEP setting.
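As a rough illustration of the two objective values, the following Python sketch computes the maximum completion time and a time-of-use electricity cost for already decoded operations. The tariff, the power values and the function name `tou_cost` are assumptions made for this example; the source does not give the tariff or the power model.

```python
def tou_cost(start, end, power_kw, tariff):
    """Electricity cost of running at power_kw from start to end (in hours).

    tariff is a list of non-overlapping (period_start, period_end, price_per_kwh).
    """
    cost = 0.0
    for period_start, period_end, price in tariff:
        overlap = min(end, period_end) - max(start, period_start)
        if overlap > 0:
            cost += overlap * power_kw * price
    return cost

# Hypothetical tariff: cheap at night, expensive during the day.
tariff = [(0, 8, 0.4), (8, 20, 1.0), (20, 24, 0.4)]
# Hypothetical decoded operations: (start time, end time, power in kW).
operations = [(6, 10, 5.0), (10, 12, 2.0)]

makespan = max(end for _, end, _ in operations)
total_cost = sum(tou_cost(s, e, p, tariff) for s, e, p in operations)
```

In this sketch the first operation spans two price periods, so its cost is split between the low and the high price.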
2. Population initialization
Reasonable initialization strategies provide a good starting point for exploration and exploitation. The present disclosure designs three different initialization strategies based on the SV, MAV and PSV. To obtain a diverse initial population, one third of the individuals are obtained according to initialization strategy 1, one third according to initialization strategy 2, and the rest according to a random initialization strategy. In all initialization strategies, every operation is assigned to a machine tool capable of processing it, which guarantees the feasibility of the solutions.
Initialization strategy 1 reduces the time and energy consumed by handling, machine tool preparation and idling. According to Property 1, the next operation placed in the SV is taken from the most recently scheduled workpiece that is not yet completed, and its machine tool is the same as the one assigned to the preceding operation. To reduce the total energy consumption, lower processing speeds are preferred, so all speed levels in the PSV are randomly selected from the lower speed levels.
Initialization strategy 2 is identical to initialization strategy 1 except for how the PSV is decided. To reduce the maximum completion time, all speed levels in the PSV of initialization strategy 2 are randomly selected from the higher speed levels.
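The three-way split of the population can be sketched as follows. This is a simplified reading, not the exact procedure of the disclosure: it assumes that strategies 1 and 2 schedule the operations of one workpiece consecutively, that the previous machine tool is reused only when it can process the next operation, and that "lower" and "higher" speeds mean the lower and upper half of the speed levels. All function names and the test data are hypothetical.

```python
import random

def init_individual(n_ops, capable, n_speeds, strategy, rng):
    """Build one (SV, MAV, PSV) individual.

    n_ops[i]        -- number of operations of workpiece i
    capable[(i, j)] -- machine tools able to process operation j of workpiece i
    strategy        -- "s1", "s2" or "random"
    """
    jobs = list(n_ops)
    if strategy == "random":
        sv = [i for i in jobs for _ in range(n_ops[i])]
        rng.shuffle(sv)
    else:
        # Assumed reading of Property 1: stay on one workpiece until it is done.
        rng.shuffle(jobs)
        sv = [i for i in jobs for _ in range(n_ops[i])]

    half = max(1, n_speeds // 2)
    mav, psv, done, last_machine = [], [], {i: 0 for i in jobs}, {}
    for i in sv:
        j = done[i]
        done[i] += 1
        options = capable[(i, j)]
        if strategy != "random" and last_machine.get(i) in options:
            machine = last_machine[i]      # reuse the preceding operation's machine
        else:
            machine = rng.choice(options)  # always a capable machine tool
        last_machine[i] = machine
        mav.append(machine)
        if strategy == "s1":
            psv.append(rng.randint(1, half))                        # assumed lower half
        elif strategy == "s2":
            psv.append(rng.randint(n_speeds - half + 1, n_speeds))  # assumed upper half
        else:
            psv.append(rng.randint(1, n_speeds))
    return sv, mav, psv

def init_population(size, n_ops, capable, n_speeds, seed=0):
    rng = random.Random(seed)
    third = size // 3
    strategies = ["s1"] * third + ["s2"] * third + ["random"] * (size - 2 * third)
    return [(s, init_individual(n_ops, capable, n_speeds, s, rng)) for s in strategies]

# Hypothetical instance: 3 workpieces, 3 machine tools, 4 speed levels.
n_ops = {0: 2, 1: 2, 2: 1}
capable = {(0, 0): [0, 1], (0, 1): [1], (1, 0): [0, 2], (1, 1): [0, 2], (2, 0): [1, 2]}
population = init_population(9, n_ops, capable, n_speeds=4)
```

Every individual assigns each operation to a capable machine tool, which mirrors the feasibility guarantee stated above.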
3. Estimation of distribution algorithm
Corresponding to the SV, MAV and PSV in the encoding, the probability matrices of the FJSP in the present disclosure are $A_{mi}$, $B_{ijk}$ and $C_{ijs}$. The initialization and updating of these probability matrices, and the generation of solutions from them, are described below.
3.1 initialization and updating of probability matrices
$A_{mi}$ is the probability that an operation of workpiece $i$ is placed at the $m$-th position of the SV. $A_{mi}$ is initialized according to equation (47). Its update rule at each iteration is given by equation (48), where $I_{mi}$ is defined by equation (49), $SI$ is the number of elite solutions, and the update rate $\alpha$ is 0.5. In this study, $SI$ is half the population size.
$$A_{mi}(Gen=0) = 1/I \qquad (47)$$
$B_{ijk}$ is the probability that $O_{i,j}$ is processed by machine tool $k$. $B_{ijk}$ is initialized according to equation (50). Its update rule at each iteration is given by equation (51), where $I_{ijk}$ is defined by equation (52) and the update rate $\beta$ is 0.5.
$C_{ijs}$ is the probability that $O_{i,j}$ is processed at the $s$-th speed level. $C_{ijs}$ is initialized according to equation (53). Its update rule at each iteration is given by equation (54), where $I_{ijs}$ is defined by equation (55) and the update rate $\gamma$ is 0.5.
$$C_{ijs}(Gen=0) = 1/S \qquad (53)$$
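Because equations (48) and (49) are not reproduced in this text, the following Python sketch only illustrates one common form of such an update, a frequency-based rule in the style of population-based incremental learning. The rule itself, the reading of $I$ as the number of workpieces, and the name `update_position_matrix` are assumptions, not the formulas of the disclosure.

```python
import numpy as np

def update_position_matrix(A, elite_svs, alpha=0.5):
    """Blend A with the position frequencies observed in the elite solutions.

    A[m, i] is the probability that workpiece i occupies SV position m (0-based).
    """
    freq = np.zeros_like(A)
    for sv in elite_svs:
        for m, workpiece in enumerate(sv):
            freq[m, workpiece] += 1.0
    freq /= len(elite_svs)                     # share of elites with workpiece i at m
    return (1.0 - alpha) * A + alpha * freq    # assumed update form

n_workpieces, n_positions = 3, 6
# Equation (47), assuming I is the number of workpieces.
A = np.full((n_positions, n_workpieces), 1.0 / n_workpieces)
# Hypothetical elite SVs (workpieces numbered from 0).
elite = [[1, 0, 2, 0, 1, 2], [1, 2, 0, 0, 2, 1]]
A = update_position_matrix(A, elite)
```

With this form each row of the matrix remains a probability distribution, since both the old row and the frequency row sum to 1.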
3.2 Generation of solutions
The SV, MAV and PSV of a solution are generated in sequence from the probability matrices $A_{mi}$, $B_{ijk}$ and $C_{ijs}$. For $A_{mi}$, once all operations of a workpiece have been scheduled, the probability of selecting that workpiece at subsequent positions is 0; for $B_{ijk}$, if machine tool $k$ cannot process $O_{i,j}$, then $B_{ijk}$ is always 0; for $C_{ijs}$, the speed assignment always follows the probability distribution.
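The masking rule for $A_{mi}$ can be sketched in Python as follows. The renormalization after masking and the uniform fallback when all remaining probability mass sits on finished workpieces are assumptions of this sketch; `sample_sv` is a hypothetical name.

```python
import numpy as np

def sample_sv(A, n_ops, rng):
    """Sample an SV position by position from A, masking finished workpieces.

    A has one row per SV position, so A.shape[0] must equal sum(n_ops).
    """
    remaining = np.array(n_ops, dtype=int)
    sv = []
    for m in range(A.shape[0]):
        unfinished = remaining > 0
        p = A[m] * unfinished             # finished workpieces get probability 0
        if p.sum() == 0.0:
            p = unfinished.astype(float)  # assumed fallback: uniform over unfinished
        p = p / p.sum()
        workpiece = int(rng.choice(len(n_ops), p=p))
        sv.append(workpiece)
        remaining[workpiece] -= 1
    return sv

rng = np.random.default_rng(0)
n_ops = [2, 2, 2]
A = np.full((sum(n_ops), len(n_ops)), 1.0 / len(n_ops))
sv = sample_sv(A, n_ops, rng)
```

The MAV and PSV can be sampled in the same way from $B_{ijk}$ and $C_{ijs}$, where incapable machine tools already have probability 0.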
4. Deep reinforcement learning
4.1 State features
The state features describe the scheduling state at a decision point. They should therefore reflect the effect of the actions and help the model select an action.
The DQN part of the present disclosure uses 34 state features divided into three groups. The first group contains features of the problem scale, including the number of workpieces, the number of machine tools, the number of operations of all workpieces, and the standard deviation $OP_{std}$ of the number of operations assigned to each machine tool. The second group contains features associated with the objective values, including the maximum completion time, the mean and standard deviation of the completion times of the individual machine tools, the total electricity cost, and the mean and standard deviation of the electricity costs of the individual machine tools. The third group contains features associated with the scheduling elements, including the time and total electricity cost of the machine tool processing state, idle state, preparation state and handling process, as well as the mean and standard deviation of the corresponding variables. All state features are shown in Fig. 3.
4.2. Actions
In EDA-DQN, 9 actions are designed based on the properties of the problem, so that the actions can effectively improve the quality of the resulting solutions. To enhance the exploration ability of the DQN, three actions are included to generate multiple solutions.
The first action $a_1$ is designed according to Property 2 and Property 3 and can optimize the maximum completion time and the total electricity cost simultaneously. Algorithm 1 describes the procedure of $a_1$.
According to Property 4, the second action $a_2$ changes the processing order of operations on the same machine tool on the critical path, reducing the idle time and power consumption of the machine tool. Algorithm 2 describes the procedure of $a_2$.
The third action $a_3$ is designed according to Property 4 and Property 5 and changes the machine tool allocation of operations on the critical path. Algorithm 3 describes the procedure of $a_3$.
Actions $a_4$, $a_5$ and $a_6$ are classical mutation operators. $a_4$ swaps two operations of different workpieces in the SV, with the machine tool allocation and speed settings unchanged. Similar to $a_4$, $a_5$ randomly selects two pairs of operations to swap. $a_6$ randomly selects two operations and moves the later one to the position before the earlier one, with the machine tool allocation and speed settings unchanged.
To enhance the exploration ability of the model, actions $a_7$, $a_8$ and $a_9$ are random strategies. $a_7$ changes the processing order of a random segment of operations in the SV, with the machine tool allocation and speed settings unchanged. $a_8$ changes only the machine tool allocation of one random operation, and the newly assigned machine tool must be able to process that operation. $a_9$ changes only the processing speed of one random operation.
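A few of the simpler moves can be sketched in Python. The sketch assumes, for simplicity, that machine tools and speed levels are stored per operation (keyed by workpiece and operation index) rather than per SV position, so that reordering the SV leaves them unchanged. The function names are hypothetical, and the problem-specific actions $a_1$ to $a_3$ (Algorithms 1 to 3) are not covered.

```python
import random

def swap_different_jobs(sv, rng):
    """a4-style move: swap two SV entries that belong to different workpieces."""
    sv = list(sv)
    a = rng.randrange(len(sv))
    others = [k for k, workpiece in enumerate(sv) if workpiece != sv[a]]
    if others:
        b = rng.choice(others)
        sv[a], sv[b] = sv[b], sv[a]
    return sv

def insert_before(sv, rng):
    """a6-style move: move the later of two random entries before the earlier one."""
    sv = list(sv)
    a, b = sorted(rng.sample(range(len(sv)), 2))
    sv.insert(a, sv.pop(b))
    return sv

def reassign_machine(machine_of, capable, rng):
    """a8-style move: redraw the machine tool of one random operation
    from its capable machine tools (the draw may return the same one)."""
    machine_of = dict(machine_of)
    op = rng.choice(sorted(machine_of))
    machine_of[op] = rng.choice(capable[op])
    return machine_of

def change_speed(speed_of, n_speeds, rng):
    """a9-style move: redraw the speed level of one random operation."""
    speed_of = dict(speed_of)
    op = rng.choice(sorted(speed_of))
    speed_of[op] = rng.randint(1, n_speeds)
    return speed_of
```

Each function returns a modified copy, so the current solution is preserved if the new one turns out to be worse.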
4.3. Reward
At decision point t, the reward of the DQN part is defined as in Algorithm 4.
5. DQN structure and training procedure
In the DQN, the numbers of input and output nodes equal the number of state features and the number of actions, respectively. A fully connected neural network with 2 hidden layers is used, each with 20 nodes. The activation function from the input layer to the hidden layers is the ReLU function, and the activation function of the output layer is the purelin (linear) function. The training process of the DQN is described in Algorithm 5.
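Algorithm 5 is not reproduced here. The following NumPy sketch only illustrates the network structure described above (34 inputs, two hidden layers of 20 ReLU nodes, 9 linear outputs) together with the standard DQN temporal-difference target. The weight initialization, the discount factor `gamma` and all function names are assumptions.

```python
import numpy as np

def init_qnet(n_in=34, n_hidden=20, n_out=9, seed=0):
    """Create weights and biases for a 34-20-20-9 fully connected network."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, n_hidden, n_hidden, n_out]
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def q_values(params, state):
    """Forward pass: ReLU on the hidden layers, linear (purelin) output."""
    h = np.asarray(state, dtype=float)
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)
    W, b = params[-1]
    return h @ W + b

def td_target(target_params, reward, next_state, gamma=0.9, done=False):
    """Standard DQN target: r + gamma * max_a Q_target(s', a)."""
    if done:
        return float(reward)
    return float(reward + gamma * np.max(q_values(target_params, next_state)))

online = init_qnet()
# The target network starts as a copy of the online network.
target = [(W.copy(), b.copy()) for W, b in online]
```

In a standard DQN, the online parameters are trained by gradient descent on the squared difference between this target and the online Q value, and the target network is refreshed from the online network at intervals.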
6. Non-dominated sorting and crowding-distance sorting
For two solutions x and y, x dominates y if $(C_{\max,x} \le C_{\max,y}) \land (TEC_x < TEC_y)$ or $(C_{\max,x} < C_{\max,y}) \land (TEC_x \le TEC_y)$; otherwise, x does not dominate y. If neither solution dominates the other, x and y are mutually non-dominated.
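The dominance test above, together with a simple front-peeling sort and the usual NSGA-II style crowding distance, can be sketched in Python as follows. The source does not spell out the sorting procedure, so the two sorting functions are a conventional choice rather than the disclosed implementation, and the objective values are hypothetical.

```python
def dominates(x, y):
    """x and y are (C_max, TEC) pairs; both objectives are minimized."""
    return (x[0] <= y[0] and x[1] < y[1]) or (x[0] < y[0] and x[1] <= y[1])

def non_dominated_fronts(points):
    """Peel off successive non-dominated fronts; returns lists of indices."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def crowding_distance(points, front):
    """Crowding distance of each index in one front (boundary points get inf)."""
    dist = {i: 0.0 for i in front}
    for k in range(2):
        order = sorted(front, key=lambda i: points[i][k])
        low, high = points[order[0]][k], points[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (points[nxt][k] - points[prev][k]) / (high - low)
    return dist

# Hypothetical (C_max, TEC) values of five solutions.
points = [(10, 50), (12, 40), (11, 60), (15, 30), (12, 45)]
fronts = non_dominated_fronts(points)
distances = crowding_distance(points, fronts[0])
```

Solutions are then ranked first by front index and, within a front, by descending crowding distance.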
Example 2:
embodiment 2 of the present disclosure provides a multi-objective optimized machine tool flexible shop scheduling system, comprising:
an initialization module configured to: initialize a target network and an online network;
a network training module configured to: perform deep Q-network (DQN) training using activation functions;
an initialization policy module configured to: initialize a population through initialization strategies;
a ranking module configured to: perform non-dominated sorting and crowding-distance sorting.
The working method of the system is the same as the multi-objective optimized machine tool flexible shop scheduling method provided in embodiment 1, and is not repeated here.
Example 3:
embodiment 3 of the present disclosure provides a medium having stored thereon a program which, when executed by a processor, implements steps in a multi-objective optimized machine tool flexible shop scheduling method as described in embodiment 1 of the present disclosure.
The detailed steps are the same as those of the multi-objective optimized machine tool flexible shop scheduling method provided in embodiment 1, and are not repeated here.
Example 4:
embodiment 4 of the present disclosure provides an apparatus including a memory, a processor, and a program stored on the memory and executable on the processor, which when executed implements steps in a multi-objective optimized machine tool flexible shop scheduling method according to embodiment 1 of the present disclosure.
The detailed steps are the same as those of the multi-objective optimized machine tool flexible shop scheduling method provided in embodiment 1, and are not repeated here.
Example 5: Experimental analysis
5.1. Test instances
The test instances used in the present disclosure are randomly generated and divided into three groups. The first group includes 30 instances, with the number of workpieces taken from {10,20,30,40,50,60,80,100,150,200} and the number of machine tools from {5,8,10}. The second group includes 12 instances, with the number of workpieces taken from {4,6,8,10} and the number of machine tools from {2,3,5}. The third group is a subset of the first group and includes the 10 instances with 8 machine tools. The instance name "10J5M" denotes an instance with 10 workpieces and 5 machine tools.
5.2. Evaluation criteria
The present disclosure uses the hypervolume (HV) and the inverted generational distance (IGD) to compare the performance of the proposed algorithm with that of other algorithms.
In the calculation of HV, the two objectives are normalized according to equation (56), where $f_i'$ is the $i$-th normalized objective value, $f_i$ is the $i$-th objective value, and $f_{i,\max}$ is the maximum value of the $i$-th objective.
$$f_i' = f_i / f_{i,\max} \qquad (56)$$
The reference point for the HV calculation is set to (3, 3), and the HV value is calculated as follows.
$$HV = \lambda\left(\bigcup_{i=1}^{|NS|} v_i\right)$$
where $\lambda(\cdot)$ is the Lebesgue measure, $NS$ is the non-dominated solution set, and $v_i$ is the hypervolume between the non-dominated solution $\pi_i$ and the reference point.
The IGD is calculated as follows.
$$IGD = \frac{1}{|PF|} \sum_{j \in PF} \min_{i \in NS} |j - i|$$
where $PF$ is the true Pareto front and $|j - i|$ is the Euclidean distance between solution $j$ on $PF$ and solution $i$ in $NS$.
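For two minimized objectives, both indicators can be computed with a few lines of Python. This sketch uses the standard definitions of hypervolume and IGD, which match the symbols described above; the function names and the sample points are illustrative assumptions.

```python
import math

def normalize(points):
    """Equation (56): divide each objective by its maximum over the set."""
    maxima = [max(p[k] for p in points) for k in range(2)]
    return [tuple(p[k] / maxima[k] for k in range(2)) for p in points]

def hypervolume_2d(front, ref):
    """Area dominated by 2-objective (minimized) points, bounded by ref."""
    pts = sorted(p for p in front if p[0] <= ref[0] and p[1] <= ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:  # only points that improve f2 add new area
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

def igd(pf, ns):
    """Mean distance from each point of the true front to its nearest found point."""
    return sum(min(math.dist(j, i) for i in ns) for j in pf) / len(pf)

# Hypothetical normalized fronts.
hv = hypervolume_2d([(1.0, 2.0), (2.0, 1.0)], ref=(3.0, 3.0))
distance = igd(pf=[(0.0, 1.0), (1.0, 0.0)], ns=[(0.0, 1.0), (1.0, 1.0)])
```

A larger HV and a smaller IGD indicate a better approximation of the Pareto front.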
The relative percentage increment (RPI) is used to compare the indicator values of different algorithms. In its formula, $f_c$ is the average indicator value obtained by the algorithm under comparison, and $f_{\min}$ is the best average indicator value among all algorithms.
5.3. Parameter calibration
The EDA-DQN model is most affected by the parameter μ of the softmax action selection strategy, so the present disclosure calibrates only this parameter. A larger μ favors actions with larger Q values but tends to fall into local optima; a smaller μ favors random action selection but loses the knowledge-based policy information. Fig. 4 shows the trends of HV and IGD for values of μ from 1.1 to 2.0; both HV and IGD reach their best values when μ is 1.5. Therefore, μ is set to 1.5 in the following experiments.
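The source does not state how μ enters the softmax strategy. One reading that is consistent with the description (μ close to 1 gives nearly random selection, larger μ favors larger Q values) is to use μ as the base of the exponent, so that the probability of action $a$ is proportional to $\mu^{Q(a)}$. The Python sketch below follows this assumption only; the function names are hypothetical.

```python
import random

def softmax_probs(q_values, mu=1.5):
    """Selection probabilities proportional to mu ** Q (assumed form)."""
    q_max = max(q_values)
    weights = [mu ** (q - q_max) for q in q_values]  # shifted for stability
    total = sum(weights)
    return [w / total for w in weights]

def select_action(q_values, mu, rng):
    """Draw one action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, mu)
    return rng.choices(range(len(q_values)), weights=probs)[0]

probs = softmax_probs([1.0, 2.0, 3.0], mu=1.5)
action = select_action([1.0, 2.0, 3.0], 1.5, random.Random(0))
```

Under this reading, μ = 1 yields a uniform choice over the 9 actions, and increasing μ concentrates the probability on the action with the largest Q value.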
5.4. Initialization strategy comparison experiment
To verify the effectiveness of the initialization strategies in this disclosure, the algorithm with the proposed initialization strategies is compared with the algorithm with a purely random initialization strategy (R-init). Fig. 5 shows the analysis of variance of the RPI values for HV and IGD; it can be seen that EDA-DQN with the proposed initialization strategies outperforms random initialization.
5.5. Comparison experiment on exploration and exploitation capabilities
To compare the exploration and exploitation capabilities of EDA-DQN, three algorithms are designed for performance comparison: an algorithm without the EDA part (NEDA-DQN), an algorithm without the DQN part (EDA-NDQN), and an algorithm containing both parts (EDA-DQN). The three algorithms differ in complexity, so all compared algorithms are run for 1000 iterations to ensure fairness.
Fig. 6 shows the analysis of variance of the RPI values for HV and IGD. The proposed EDA-DQN performs better than either the EDA part or the DQN part alone, indicating that EDA-DQN combines the exploration capability of the EDA part with the exploitation capability of the DQN part.
5.6. Action selection strategy comparison experiment
To demonstrate the effect of DQN-based selection, EDA-DQN is compared with the same algorithm using a random action selection strategy (EDA-RDQN). Fig. 7 shows the analysis of variance of the RPI values for HV and IGD; it can be seen that the DQN part selects appropriate actions at different scheduling points.
5.7. Algorithm performance comparison experiment
To verify the performance of EDA-DQN, three currently competitive algorithms, HMOEA/D [4], TPM [49] and BEG-NSGA-II [50], are chosen for comparison. Fig. 8 shows the analysis of variance of the RPI values for HV and IGD; it can be seen that EDA-DQN has a clear advantage over the other algorithms in solving the FJSP of the present disclosure.
5.8. Comparison experiment with the CPLEX solver
To verify the correctness of the proposed MILP model, the performance of EDA-DQN is compared with that of the CPLEX solver. In the CPLEX optimization, the objective is a weighted combination of $C_{\max}$ and TEC, with the weight $w$ taking values in {0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9}, so that nine optimal solutions are obtained for each instance. The solution time for each instance in the CPLEX optimizer is 1 h. Table 1 lists the results of the CPLEX solver and EDA-DQN, where no feasible solution is obtained for instance 10J5M within the given time. From the results in Table 1, EDA-DQN obtains better results in both HV and IGD.
Table 1 CPLEX comparative experiment results
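As an illustration of how a set of solutions, such as the nine weighted-sum solutions of one instance, can be scored, the sketch below filters a set of ($C_{\max}$, TEC) points to its non-dominated subset and computes a two-objective hypervolume (HV). The dominance test follows the definition given in claim 5. The sample points, the reference point, and the function names are made up for illustration, and any objective normalization used in the experiments is not reproduced here.

```python
def dominates(x, y):
    """x, y are (C_max, TEC) pairs; both objectives are minimised."""
    return (x[0] <= y[0] and x[1] < y[1]) or (x[0] < y[0] and x[1] <= y[1])

def pareto_front(points):
    """Keep only the points that no other point dominates, sorted by C_max."""
    unique = sorted(set(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]

def hypervolume_2d(points, ref):
    """Area dominated by the front and bounded by the reference point."""
    front = [p for p in pareto_front(points)
             if p[0] < ref[0] and p[1] < ref[1]]
    area, prev_tec = 0.0, ref[1]
    for c_max, tec in front:  # C_max ascending, hence TEC descending
        area += (ref[0] - c_max) * (prev_tec - tec)
        prev_tec = tec
    return area

# Hypothetical (C_max, TEC) values, not taken from Table 1.
solutions = [(10, 50), (12, 40), (15, 45), (20, 30)]
print(pareto_front(solutions))              # [(10, 50), (12, 40), (20, 30)]
print(hypervolume_2d(solutions, (25, 60)))  # 330.0
```

A larger HV means the front covers more of the objective space below the reference point, which is why a higher HV is read as a better result.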
The experiments show that the proposed knowledge-based hybrid EDA-DQN algorithm can solve the constrained FJSP and optimize the maximum finishing time and the total electricity charge at the same time. Five knowledge-based properties of the FJSP enhance the exploration and exploitation capabilities of the DQN part. The effectiveness of the initialization strategies and of the MILP model is demonstrated experimentally. The comparison results show that both the EDA part and the DQN part improve the performance of EDA-DQN, and that EDA-DQN obtains better results than other efficient algorithms.
It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored on a computer-readable storage medium, which, when executed, may comprise the steps of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing describes only preferred embodiments of the present disclosure and is not intended to limit the disclosure; various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Claims (7)
1. A multi-objective optimized machine tool flexible shop scheduling method, characterized by comprising:
acquiring workshop machine tool constraints, including machine tool processing speed, machine tool preparation time, idle time, time-of-use electricity price and carrying time;
optimizing the maximum finishing time and the total electricity charge by using an EDA-DQN model according to the acquired workshop machine tool constraint;
obtaining an optimization result;
the EDA-DQN model optimizes the maximum finishing time and the total electricity charge, and specifically comprises the following steps:
initializing a target network and an online network Q;
performing deep Q-network (DQN) training by using an activation function;
initializing a population by an initialization strategy; the population initialization comprises the steps of designing a first initialization strategy, a second initialization strategy and a random initialization strategy according to a scheduling vector, a machine tool allocation vector and a processing speed vector; one third of population individuals are obtained according to a first initialization strategy, one third of population individuals are obtained according to a second initialization strategy, and one third of population individuals are obtained according to a random initialization strategy;
performing non-dominated sorting and crowding-degree sorting to obtain better solutions;
the DQN section uses 34 state features that are divided into three groups;
the first group comprises state features related to the problem scale, taking into account the number of workpieces, the number of machine tools, the number of operations of all workpieces, and the standard deviation $OP_{std}$ of the operations allocated to all machine tools; the second group comprises state features related to the target values, taking into account the maximum finishing time, the average and standard deviation of the finishing time of each machine tool, the total electricity charge, and the average and standard deviation of the electricity charge of each machine tool; the third group comprises state features related to the scheduling elements, taking into account the time and total electricity charge of the machine tool processing state, idle state, preparation state and carrying process, and the averages and standard deviations of the corresponding variables;
in EDA-DQN, 9 actions are designed based on the nature of the problem;
to enhance the exploration ability of DQN, three actions are included to generate multiple solutions;
the first action $a_1$ can optimize the maximum finishing time and the total electricity charge simultaneously;
the second action $a_2$ changes the processing order of operations of the same machine tool on the critical path, reducing the idle time and the power consumption of the machine tool;
the third action $a_3$ changes the machine tool assignment of operations on the critical path;
actions $a_4$, $a_5$ and $a_6$ act as classical mutation operators; $a_4$ exchanges two operations from different workpieces on the scheduling vector, keeping the machine tool allocation and the speed setting unchanged; similarly to $a_4$, $a_5$ randomly selects two pairs of operations for exchange; $a_6$ randomly selects two operations and moves the later-processed operation in front of the earlier one, keeping the machine tool allocation and the speed setting unchanged;
in order to enhance the exploration capability of the model, actions $a_7$, $a_8$ and $a_9$ use random strategies; $a_7$ adjusts the processing order of a random operation on the scheduling vector, keeping the machine tool allocation and the speed setting unchanged; $a_8$ changes only the machine tool allocation of a random operation, the newly allocated machine tool being able to process that operation; $a_9$ changes only the processing speed of a random operation.
2. The multi-objective optimized machine tool flexible shop scheduling method according to claim 1, wherein the number of input and output nodes in the DQN corresponds to the number of feature vectors and actions.
3. The multi-objective optimized machine tool flexible shop scheduling method according to claim 2, wherein the DQN training specifically comprises: a fully connected neural network with 2 hidden layers is adopted, wherein the number of hidden layer nodes is 20; the activation function from the input layer to the hidden layers is the ReLU function, and the activation function of the output layer is the purelin (linear) function.
4. A multi-objective optimized machine tool flexible shop scheduling method according to claim 3, wherein the DQN training further comprises:
according to [formula not recoverable], updating the parameter θ of the online network Q by gradient descent to obtain a target network [symbol not recoverable].
5. The multi-objective optimized machine tool flexible shop scheduling method according to claim 4, wherein, in the non-dominated sorting and crowding-degree sorting, for two solutions x and y, if
$(C_{\max,x} \le C_{\max,y}) \land (TEC_x < TEC_y)$
or
$(C_{\max,x} < C_{\max,y}) \land (TEC_x \le TEC_y)$,
x dominates y;
otherwise, x and y do not dominate each other.
6. A medium having stored thereon a program, which when executed by a processor, implements the steps of the multi-objective optimized machine tool flexible shop scheduling method according to any one of claims 1-5.
7. An apparatus comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the processor performs the steps in the multi-objective optimized machine tool flexible shop scheduling method according to any one of claims 1-5 when the program is executed.
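The mutation-style actions recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the scheduling vector is assumed to be a list of workpiece IDs in which the k-th occurrence of an ID denotes the k-th operation of that workpiece, the speed vector is assumed to hold integer speed levels, and the function names `swap_ops`, `move_forward` and `change_speed` are hypothetical.

```python
import random

def swap_ops(schedule, rng):
    """a4-like: swap two entries that belong to different workpieces."""
    s = list(schedule)
    pairs = [(i, j) for i in range(len(s)) for j in range(i + 1, len(s))
             if s[i] != s[j]]
    if pairs:
        i, j = rng.choice(pairs)
        s[i], s[j] = s[j], s[i]
    return s

def move_forward(schedule, rng):
    """a6-like: move the later of two random entries in front of the earlier."""
    s = list(schedule)
    if len(s) >= 2:
        i, j = sorted(rng.sample(range(len(s)), 2))
        s.insert(i, s.pop(j))
    return s

def change_speed(speeds, n_levels, rng):
    """a9-like: give one random operation a different speed level."""
    v = list(speeds)
    if v and n_levels > 1:
        k = rng.randrange(len(v))
        v[k] = rng.choice([lvl for lvl in range(n_levels) if lvl != v[k]])
    return v

rng = random.Random(1)
schedule = [0, 1, 2, 0, 1, 2]   # made-up: 3 workpieces, 2 operations each
speeds = [0, 1, 2, 1, 0, 2]     # made-up speed level per operation
print(swap_ops(schedule, rng))
print(move_forward(schedule, rng))
print(change_speed(speeds, 3, rng))
```

Under the assumed encoding, both scheduling-vector operators only permute workpiece IDs, so the machine tool allocation and speed vectors stay untouched, as the claim requires for $a_4$ and $a_6$.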
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110986700.2A CN113759841B (en) | 2021-08-26 | 2021-08-26 | Multi-objective optimized machine tool flexible workshop scheduling method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110986700.2A CN113759841B (en) | 2021-08-26 | 2021-08-26 | Multi-objective optimized machine tool flexible workshop scheduling method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113759841A (en) | 2021-12-07 |
CN113759841B (en) | 2024-01-12 |
Family
ID=78791348
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110986700.2A (Active; CN113759841B (en)) | Multi-objective optimized machine tool flexible workshop scheduling method and system | 2021-08-26 | 2021-08-26 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113759841B (en) |
Families Citing this family (1)
* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117933684B (en) * | 2024-01-17 | 2024-09-03 | 深圳市链宇技术有限公司 | Workshop scheduling method considering raw material alignment constraint and multi-machine parallel processing |
Citations (10)
* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103390195A (en) * | 2013-05-28 | 2013-11-13 | 重庆大学 | Machine workshop task scheduling energy-saving optimization system based on reinforcement learning |
CN109270904A (en) * | 2018-10-22 | 2019-01-25 | 中车青岛四方机车车辆股份有限公司 | A kind of flexible job shop batch dynamic dispatching optimization method |
CN110781614A (en) * | 2019-12-06 | 2020-02-11 | 北京工业大学 | Shipboard aircraft tripping recovery online scheduling method based on deep reinforcement learning |
CN111199272A (en) * | 2019-12-30 | 2020-05-26 | 同济大学 | Adaptive scheduling method for intelligent workshop |
CN111352502A (en) * | 2018-12-20 | 2020-06-30 | 三星电子株式会社 | Bioresponsive virtual reality system and method of operating the same |
CN111369181A (en) * | 2020-06-01 | 2020-07-03 | 北京全路通信信号研究设计院集团有限公司 | Train autonomous scheduling deep reinforcement learning method and module |
CN112286149A (en) * | 2020-10-15 | 2021-01-29 | 山东师范大学 | Flexible workshop scheduling optimization method and system considering crane transportation process |
CN112734172A (en) * | 2020-12-25 | 2021-04-30 | 南京理工大学 | Hybrid flow shop scheduling method based on time sequence difference |
CN112884239A (en) * | 2021-03-12 | 2021-06-01 | 重庆大学 | Aerospace detonator production scheduling method based on deep reinforcement learning |
CN113157202A (en) * | 2020-01-22 | 2021-07-23 | 三星电子株式会社 | Memory controller, memory device including the same, and method of operating the same |
Family Cites Families (1)
* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11170293B2 (en) * | 2015-12-30 | 2021-11-09 | Microsoft Technology Licensing, Llc | Multi-model controller |
Patent Citations (10)
* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103390195A (en) * | 2013-05-28 | 2013-11-13 | 重庆大学 | Machine workshop task scheduling energy-saving optimization system based on reinforcement learning |
CN109270904A (en) * | 2018-10-22 | 2019-01-25 | 中车青岛四方机车车辆股份有限公司 | A kind of flexible job shop batch dynamic dispatching optimization method |
CN111352502A (en) * | 2018-12-20 | 2020-06-30 | 三星电子株式会社 | Bioresponsive virtual reality system and method of operating the same |
CN110781614A (en) * | 2019-12-06 | 2020-02-11 | 北京工业大学 | Shipboard aircraft tripping recovery online scheduling method based on deep reinforcement learning |
CN111199272A (en) * | 2019-12-30 | 2020-05-26 | 同济大学 | Adaptive scheduling method for intelligent workshop |
CN113157202A (en) * | 2020-01-22 | 2021-07-23 | 三星电子株式会社 | Memory controller, memory device including the same, and method of operating the same |
CN111369181A (en) * | 2020-06-01 | 2020-07-03 | 北京全路通信信号研究设计院集团有限公司 | Train autonomous scheduling deep reinforcement learning method and module |
CN112286149A (en) * | 2020-10-15 | 2021-01-29 | 山东师范大学 | Flexible workshop scheduling optimization method and system considering crane transportation process |
CN112734172A (en) * | 2020-12-25 | 2021-04-30 | 南京理工大学 | Hybrid flow shop scheduling method based on time sequence difference |
CN112884239A (en) * | 2021-03-12 | 2021-06-01 | 重庆大学 | Aerospace detonator production scheduling method based on deep reinforcement learning |
Non-Patent Citations (3)
* Cited by examiner, † Cited by third party
Title |
---|
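Multi-objective optimization based on decomposition for flexible job shop scheduling under time-of-use electricity prices; En-da Jiang et al.; Knowledge-Based Systems; 2020-09-27; pp. 2-10 * |
Research on robot path planning based on artificial intelligence algorithms; Ren Qun; Journal of Zunyi Normal University (01); full text * |
Multi-objective optimization algorithm for solving hybrid flow shop scheduling; Computer Engineering and Design; 2018; full text * |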
Also Published As
Publication number | Publication date |
---|---|
CN113759841A (en) | 2021-12-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Du et al. | 2022 | Knowledge-based reinforcement learning and estimation of distribution algorithm for flexible job shop scheduling problem |
Liu et al. | 2021 | Many-objective job-shop scheduling: A multiple populations for multiple objectives-based genetic algorithm approach |
Feng et al. | 2018 | Target disassembly sequencing and scheme evaluation for CNC machine tools using improved multiobjective ant colony algorithm and fuzzy integral |
Zhang et al. | 2015 | An object-coding genetic algorithm for integrated process planning and scheduling |
Xu et al. | 2022 | Optimization approaches for solving production scheduling problem: A brief overview and a case study for hybrid flow shop using genetic algorithms |
CN106779372A (en) | 2017-05-31 | Based on the agricultural machinery dispatching method for improving immune Tabu search algorithm |
CN111985672B (en) | 2021-08-27 | Single-piece job shop scheduling method for multi-Agent deep reinforcement learning |
CN109271320B (en) | 2021-09-24 | A method for prioritization of upper-level multi-objective test cases |
Bhatt et al. | 2015 | Genetic algorithm applications on job shop scheduling problem: A review |
CN115130789A (en) | 2022-09-30 | A Distributed Manufacturing Intelligent Scheduling Method Based on Improved Grey Wolf Optimization Algorithm |
CN105975701A (en) | 2016-09-28 | Parallel scheduling disassembly path forming method based on mixing fuzzy model |
Yao et al. | 2024 | A DQN-based memetic algorithm for energy-efficient job shop scheduling problem with integrated limited AGVs |
CN113792494B (en) | 2023-11-17 | Multi-objective flexible job shop scheduling method based on migratory bird flock algorithm and cross fusion |
CN113759841B (en) | 2024-01-12 | Multi-objective optimized machine tool flexible workshop scheduling method and system |
CN114004065A (en) | 2022-02-01 | Multi-objective optimization method of substation engineering based on intelligent algorithm and environmental constraints |
Cao et al. | 2021 | An adaptive multi-strategy artificial bee colony algorithm for integrated process planning and scheduling |
Halim et al. | 2015 | A novel application of genetic algorithm for synthesizing optimal water reuse network with multiple objectives |
CN113570112A (en) | 2021-10-29 | Optimization algorithm for solving cooperative vehicle path problem with time window |
Yu et al. | 2024 | Exact and deep Q-network assisted swarm intelligence methods for scheduling multi-objective heterogeneous unmanned surface vehicles |
Wang et al. | 2024 | A feedback learning-based memetic algorithm for energy-aware distributed flexible job-shop scheduling with transportation constraints |
CN115496322A (en) | 2022-12-20 | Distributed flow shop scheduling method and device |
Vasant | 2013 | Hybrid mesh adaptive direct search genetic algorithms and line search approaches for fuzzy optimization problems in production planning |
CN110989538B (en) | 2021-06-08 | Closed-loop scheduling optimization method for complex production process |
Wang et al. | 2020 | A tailored NSGA-III for multi-objective flexible job shop scheduling |
Xu et al. | 2023 | Knowledge transfer-based multifactorial evolutionary algorithm for selective maintenance optimization of multistate complex systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2021-12-07 | PB01 | Publication | |
2021-12-24 | SE01 | Entry into force of request for substantive examination | |
2024-01-12 | GR01 | Patent grant | |