
CN108804568B - Method and device for storing copy data in Openstack in ceph - Google Patents

Method and device for storing copy data in Openstack in ceph

Info

Publication number
CN108804568B
CN108804568B (application CN201810500277.9A)
Authority
CN
China
Prior art keywords
storage, domains, sub, fault, domain
Prior art date
2018-05-23
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810500277.9A
Other languages
Chinese (zh)
Other versions
CN108804568A (en)
Inventor
韩庆波 (Han Qingbo)
马文军 (Ma Wenjun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-05-23
Filing date
2018-05-23
Publication date
2021-07-09
2018-05-23 Application filed by Beijing QIYI Century Science and Technology Co Ltd
2018-05-23 Priority to CN201810500277.9A
2018-11-13 Publication of CN108804568A
2021-07-09 Application granted
2021-07-09 Publication of CN108804568B
Status: Active
2038-05-23 Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662 Virtualisation aspects
    • G06F3/0665 Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a method and an apparatus for storing copy data from Openstack in ceph, as well as an electronic device and a storage medium. The method comprises the following steps: acquiring the topology of a storage device cluster in ceph; dividing each physical fault domain of the storage device cluster into a plurality of virtual fault sub-domains; for the copy data to be stored by the storage service of each service Zone in Openstack, selecting a first number of virtual fault sub-domains as storage fault domains based on the first number of copies to be stored, where different virtual fault sub-domains among those selected belong to different physical fault domains, and the virtual fault sub-domains selected for the storage services of different Zones do not overlap; and storing the copy data of each Zone's storage service in the storage devices of the storage fault domains corresponding to that service. When data is stored with this method, the failure of a single storage device does not affect the storage services of multiple Zones.

Description

Method and device for storing copy data in Openstack in ceph

Technical Field

The invention relates to the technical field of data management, and in particular to a method and an apparatus for storing copy data from Openstack in ceph, as well as an electronic device and a storage medium.

Background

Generally, Openstack (a cloud computing management platform) and ceph (a distributed storage system) can be used in combination; that is, copy data that Openstack services need to persist can be stored on storage devices in ceph.

Openstack comprises multiple hosts that perform service processing. Hosts can be allocated to the services to be processed, and the hosts that process the same service are grouped into a host set; such a host set corresponds to a Zone (area), also referred to as a service area.

When data from an Openstack service needs to be stored on the storage devices in ceph, the storage needs of each Zone can be treated as a corresponding storage service, and the data of that storage service is then stored on storage devices in ceph.

The data in the storage service of each Openstack Zone is usually stored in ceph in multiple copies, with identical copies placed in different physical failure domains of ceph. A physical failure domain is a manually divided storage area; the point of the division is that a failure in one area should not affect every copy of the same data, which is why multiple identical copies are stored in different physical failure domains.

In the prior art, a divided physical failure domain may cover a wide range, and a single storage device within it may hold data belonging to the storage services of multiple Zones. With the prior-art storage method, therefore, the failure of one storage device in a physical failure domain may affect the storage services of multiple Zones.

Disclosure of Invention

Embodiments of the present invention provide a method, an apparatus, an electronic device, and a storage medium for storing copy data from Openstack in ceph, so as to prevent the storage services of multiple Zones in Openstack from being affected when a storage device in a physical failure domain fails.

The specific technical scheme is as follows:

the embodiment of the invention provides a method for storing copy data in Openstack in ceph, which comprises the following steps:

acquiring a topological structure of a storage device cluster in ceph, wherein the storage device cluster comprises a plurality of storage devices, and the topological structure represents the division condition of a physical fault domain of the storage device cluster;

dividing each physical fault domain of the storage equipment cluster into a plurality of virtual fault sub-domains;

for to-be-stored copy data of storage services of each service area Zone in an Openstack, selecting a first number of virtual fault sub-domains as storage fault domains based on a first number of copy data to be stored, wherein different virtual fault sub-domains in the selected storage fault domains belong to different physical fault domains, and the virtual fault sub-domains selected for the storage services of different zones are different;

and respectively storing the to-be-stored duplicate data of the storage service of each Zone in the storage device of each storage fault domain corresponding to the storage service.

Optionally, before selecting the first number of virtual fault sub-domains as storage fault domains, the method further includes:

dividing the obtained virtual fault sub-domains into a plurality of groups of virtual fault sub-domains, wherein different virtual fault sub-domains in each group of virtual fault sub-domains belong to different physical fault domains;

the selecting the first number of virtual fault sub-domains as storage fault domains comprises:

selecting the first number of virtual fault sub-domains as storage fault domains from a set of virtual fault sub-domains of the plurality of sets of virtual fault sub-domains.

Optionally, the method further includes:

determining storage equipment which does not store the duplicate data in each physical fault domain in the ceph as equipment to be expanded;

aiming at a certain storage service of a Zone needing capacity expansion in Openstack, adding at least one device to be expanded in each virtual fault sub-domain corresponding to the storage service of the Zone, and taking the obtained virtual fault sub-domain as a capacity expansion sub-domain, wherein the device to be expanded added in each virtual fault sub-domain and the virtual fault sub-domain belong to the same physical fault domain;

and respectively storing the copy data of the storage service of the Zone needing capacity expansion in each capacity expansion subdomain corresponding to the storage service.

Optionally, the method further includes:

dividing storage equipment which does not store copy data in each physical fault domain into service number sub-storage domains based on the service number of storage services of the Zone newly added by the Openstack;

for the to-be-stored copy data of the storage service of each newly-added Zone, based on a second quantity of the copy data to be stored, selecting the second quantity of sub-storage domains as capacity expansion storage domains, wherein different sub-storage domains in the selected capacity expansion storage domains belong to different physical fault domains, and the sub-storage domains selected for the storage services of different newly-added zones are different;

and respectively storing the to-be-stored copy data of the storage service of each newly-added Zone in the storage device of each capacity expansion storage domain corresponding to the storage service.

Optionally, before selecting the second number of sub-storage domains as the capacity expansion storage domain, the method further includes:

dividing the obtained sub-storage domains into a plurality of groups of sub-storage domains, wherein different sub-storage domains in each group of sub-storage domains belong to different physical fault domains;

the selecting the second number of sub-storage domains as a capacity expansion storage domain includes:

and selecting the second number of sub-storage domains as capacity expansion storage domains from one of the plurality of groups of sub-storage domains.

Optionally, the storage device includes a plurality of object storage devices (OSDs), and the copy data is stored in the OSDs of the storage device.

The embodiment of the invention provides a device for storing copy data in Openstack in ceph, which comprises:

the topology structure obtaining module is used for obtaining a topology structure of a storage device cluster in ceph, wherein the storage device cluster comprises a plurality of storage devices, and the topology structure represents the division condition of a physical fault domain of the storage device cluster;

the fault sub-domain dividing module is used for dividing each physical fault domain of the storage device cluster into a plurality of virtual fault sub-domains;

the fault sub-domain selection module is used for selecting a first number of virtual fault sub-domains as storage fault domains according to a first number of copy data to be stored of storage services of each service Zone in Openstack, wherein different virtual fault sub-domains in the selected storage fault domains belong to different physical fault domains, and the virtual fault sub-domains selected according to the storage services of different zones are different;

and the first storage module is used for respectively storing the to-be-stored duplicate data of the storage service of each Zone into the storage device of each storage fault domain corresponding to the storage service.

Optionally, the apparatus further comprises:

the first grouping module is used for dividing the obtained virtual fault sub-domains into a plurality of groups of virtual fault sub-domains, and different virtual fault sub-domains in each group of virtual fault sub-domains belong to different physical fault domains;

the fault sub-domain selection module is specifically configured to:

selecting the first number of virtual fault sub-domains as storage fault domains from a set of virtual fault sub-domains of the plurality of sets of virtual fault sub-domains.

Optionally, the apparatus further comprises:

the device to be expanded determining module is used for determining the storage device which does not store the duplicate data in each physical fault domain in the ceph as the device to be expanded;

the device adding module is used for adding at least one device to be expanded in each virtual fault sub-domain corresponding to a certain Zone storage service needing capacity expansion in the Openstack, and taking the obtained virtual fault sub-domain as an expansion sub-domain, wherein the device to be expanded added in each virtual fault sub-domain and the virtual fault sub-domain belong to the same physical fault domain;

and the second storage module is used for respectively storing the copy data of the storage service of the Zone needing capacity expansion in each capacity expansion sub-domain corresponding to the storage service.

Optionally, the apparatus further comprises:

the sub-storage domain dividing module is used for dividing the storage devices which do not store the copy data in each physical fault domain into sub-storage domains of the service number based on the service number of the storage services of the Zone newly added by the Openstack;

the sub-storage domain selection module is used for selecting a second number of sub-storage domains as capacity expansion storage domains according to the second number of the copy data to be stored of the storage services of each newly added Zone, wherein different sub-storage domains in the selected capacity expansion storage domains belong to different physical fault domains, and the sub-storage domains selected according to the storage services of different newly added zones are different;

and the third storage module is configured to store the to-be-stored copy data of the storage service of each newly-added Zone in the storage device of each capacity expansion storage domain corresponding to the storage service.

Optionally, the apparatus further comprises:

the second grouping module is used for dividing the obtained sub-storage domains into a plurality of groups of sub-storage domains, and different sub-storage domains in each group of sub-storage domains belong to different physical fault domains;

the sub-storage domain selection module is specifically configured to:

and selecting the second number of sub-storage domains as capacity expansion storage domains from one of the plurality of groups of sub-storage domains.

Optionally, the storage device includes a plurality of OSDs, and the duplicate data are stored in the OSDs in the storage device.

The embodiment of the invention also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory finish mutual communication through the communication bus;

the memory is used for storing a computer program;

the processor is configured to implement any of the above method steps when executing the program stored in the memory.

An embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements any of the above method steps.

By using the method for storing copy data in a ceph in an Openstack provided by the embodiment of the present invention, each physical fault domain in the ceph is divided into a plurality of virtual fault sub-domains, and then different virtual fault sub-domains are selected for the storage service of each Zone in the Openstack, so that the storage services of the zones are not affected by each other.

Of course, it is not necessary for any product or method of practicing the invention to achieve all of the above-described advantages at the same time.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.

Fig. 1 is a flowchart of a method for storing copy data in ceph in Openstack according to an embodiment of the present invention;

fig. 2 is a schematic diagram of a method for storing copy data in ceph in an Openstack according to an embodiment of the present invention;

fig. 3 is another schematic diagram of a method for storing copy data in ceph in Openstack according to an embodiment of the present invention;

fig. 4 is another schematic diagram of a method for storing copy data in ceph in Openstack according to an embodiment of the present invention;

fig. 5 is another schematic diagram of a method for storing copy data in ceph in Openstack according to an embodiment of the present invention;

fig. 6 is a schematic structural diagram of a storage device of copy data in ceph in Openstack according to an embodiment of the present invention;

fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.

The embodiments of the invention provide a method, an apparatus, an electronic device, and a storage medium for storing copy data from Openstack in ceph, which solve the prior-art problem that, when copy data from Openstack is stored in ceph, the failure of one storage device in a physical fault domain may affect the storage services of multiple Zones.

Referring to fig. 1, fig. 1 is a flowchart of a method for storing copy data in a ceph in an Openstack according to an embodiment of the present invention, where the method may include the following steps:

step S101: acquiring a topological structure of a storage device cluster in ceph, wherein the storage device cluster comprises a plurality of storage devices, and the topological structure represents the division condition of a physical fault domain of the storage device cluster;

in ceph, the data copy may be stored in a storage device, where the storage device may be an electronic device with a storage function, or a magnetic disk, and the like, which is not limited thereto. Before storing data, the topology of the storage device cluster may be obtained first, and the topology of the storage device cluster may be obtained by calling a flush map (cluster level distribution map), where the flush map describes a hierarchy of a ceph storage system, for example, how many racks the storage system includes, how many storage devices each rack includes, how many OSDs (object storage devices) each storage device includes, and the like, where an OSD is a hardware storage unit of an entity with a smaller specification.

In the embodiment of the present invention, the obtained topology of the storage device cluster may include the division of physical fault domains. A physical fault domain is a manually divided storage region; the purpose of the division is to ensure that a failure in one region cannot affect every copy of the same data, so identical copies are usually stored in different physical fault domains.

Step S102: dividing each physical fault domain of the storage equipment cluster into a plurality of virtual fault sub-domains;

In this step, the division of the physical failure domains may be adjusted by modifying the CRUSH rules (ceph's data placement rules), which define the constraints under which copy data is stored. For example, for three identical copies to be stored, a CRUSH rule can require that the three copies be placed in three different physical failure domains.

In the embodiment of the present invention, each physical fault domain of the storage device cluster may be divided into a plurality of virtual fault sub-domains by modifying the CRUSH rules.
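
As a hedged sketch (the bucket type, names, and rule below are illustrative, not taken from the patent), the virtual fault sub-domains could appear in the CRUSH map as an extra bucket level under each rack, with one placement rule per Zone restricted to that Zone's sub-domains:

```
# Hypothetical: a custom bucket type "subdomain" groups part of a rack's
# hosts; zone1's rule draws one replica from each sub-domain under
# zone1's root, each sub-domain sitting in a different physical rack.
rule zone1_rule {
    id 1
    type replicated
    step take zone1-root
    step chooseleaf firstn 0 type subdomain
    step emit
}
```

Because each Zone's rule only ever descends into that Zone's own sub-domains, a device failure inside one sub-domain cannot touch the placement of any other Zone's copies.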

Step S103: aiming at to-be-stored copy data of storage services of each Zone in Openstack, selecting a first number of virtual fault sub-domains as storage fault domains based on a first number of copy data to be stored, wherein different virtual fault sub-domains in the selected storage fault domains belong to different physical fault domains, and the virtual fault sub-domains selected for the storage services of different zones are different;

step S104: and respectively storing the to-be-stored duplicate data of the storage service of each Zone in the storage equipment of each storage fault domain corresponding to the storage service.

In the embodiment of the invention, after the physical fault domains have been divided, a number of virtual fault sub-domains equal to the number of copies can be selected to store the copy data. For example, if the storage service of a Zone keeps three copies, three virtual fault sub-domains may be selected to store the copies respectively; since the three selected sub-domains belong to different physical fault domains, the three identical copies are guaranteed to reside in three different physical fault domains.

In the embodiment of the present invention, the selected virtual fault sub-domains are different for the storage services of each different Zone. For example, referring to fig. 2, fig. 2 is a schematic diagram of a method for storing copy data in ceph in an Openstack according to an embodiment of the present invention, in the embodiment shown in fig. 2, the number of copies is three, and the number of physical fault domains is also three, as shown in fig. 2, each rack is a physical fault domain, each physical fault domain is divided into three virtual fault sub-domains, and each virtual fault sub-domain includes 3 storage devices. In the embodiment of the present invention, for the storage services of different zones, different virtual fault sub-domains may be selected for data storage.

In the embodiment of the present invention, it is assumed that there are three zones in Openstack, each Zone corresponds to one storage service, and is respectively denoted as service 1, service 2, and service 3, and when copy data in each service is stored, different virtual fault sub-domains may be selected for the copy data. For example, a1, b1 and c1 may be selected to store duplicate data in service 1, a2, b3 and c3 may be selected to store duplicate data in service 2, a3, b2 and c2 may be selected to store duplicate data in service 3, and the selected virtual fault sub-domain may be recorded as a storage fault domain for storage of the duplicate data.
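
The selection just described can be sketched in Python. This is a simplified model under assumed inputs, not the patent's implementation; names such as `rack-a` and `a1` are illustrative, and the sketch simply takes the first unused sub-domain per rack rather than the mixed picks in the example above:

```python
def assign_fault_domains(physical_domains, zones, num_replicas):
    """Give each zone num_replicas virtual fault sub-domains, one per
    physical fault domain, with no sub-domain shared between zones.

    physical_domains: {rack name: [virtual sub-domain names]}
    Returns {zone: [chosen sub-domain names]}.
    """
    racks = sorted(physical_domains)
    if len(racks) < num_replicas:
        raise ValueError("need at least num_replicas physical fault domains")
    used = set()
    assignment = {}
    for zone in zones:
        chosen = []
        for rack in racks[:num_replicas]:
            # pick any sub-domain of this rack not yet taken by another zone
            sub = next(s for s in physical_domains[rack] if s not in used)
            used.add(sub)
            chosen.append(sub)
        assignment[zone] = chosen
    return assignment
```

With three racks split into three sub-domains each and three Zones of three replicas, every Zone receives disjoint sub-domains spanning all racks, so a single failed device touches at most one Zone's storage service.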

The virtual fault sub-domains selected for the storage service of each Zone in Openstack belong to different physical fault domains, which ensures that the identical copies in each Zone's storage service are stored in different physical fault domains. Because the sub-domains selected for different Zones do not overlap, the copy data of each Zone's storage service stays within its own virtual fault sub-domains and the Zones' storage services do not affect one another: if a storage device in some virtual fault sub-domain fails, only the storage service of the Zone corresponding to that sub-domain is affected, and the storage services of the other Zones are not.

It can be seen that, with the method provided in the embodiment of the present invention, each physical fault domain of a storage device cluster is divided into a plurality of virtual fault sub-domains; for the storage service of each Zone, as many virtual fault sub-domains as there are copies are selected, and the sub-domains selected for different Zones do not overlap; the copy data of each Zone's storage service is then stored in the sub-domains corresponding to that service. This ensures that the copy data of different Zones' storage services resides in different virtual fault sub-domains, so the failure of a storage device in one sub-domain does not affect the storage services of multiple Zones.

In this embodiment of the present invention, before selecting the first number of virtual fault sub-domains as the storage fault domains, the method may further include:

dividing the obtained virtual fault sub-domains into a plurality of groups of virtual fault sub-domains, wherein different virtual fault sub-domains in each group of virtual fault sub-domains belong to different physical fault domains;

selecting a first number of virtual fault sub-domains as storage fault domains, comprising:

from one of the plurality of sets of virtual fault sub-domains, a first number of virtual fault sub-domains is selected as storage fault domains.

Referring to fig. 3, in the embodiment shown in fig. 3, each rack is a physical fault domain, and there are 4 physical fault domains in total: rack A, rack B, rack C, and rack D. Each physical fault domain may be divided into a plurality of virtual fault sub-domains, for example A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, and D3 as shown in fig. 3.

After the physical fault domains have been divided, the resulting virtual fault sub-domains can be grouped, with the different sub-domains in each group belonging to different physical fault domains. For example, A1, B1, C1, and D1 may form one group, A2, B3, C3, and D2 a second group, and A3, B2, C2, and D3 a third group; the number of sub-domains in each group may be greater than or equal to the number of copies.

After the grouping is determined, a group may be selected from the groups of virtual fault sub-domains for the storage service of each Zone. Since the selected group may contain more sub-domains than the number of copies, a number of sub-domains equal to the number of copies can be chosen within that group, and the copy data of the Zone's storage service is stored in those sub-domains respectively. For example, if the group A3, B2, C2, D3 is selected for service 1 of a certain Zone and the number of copies is three, then A3, B2, and C2 may be chosen from the group to store the copy data.
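
The grouping step can be modeled in Python as taking one sub-domain from every physical fault domain per group. This is a sketch with illustrative names, assuming every rack has been split into the same number of sub-domains and grouping them by index (the patent permits other groupings, as in the mixed example above):

```python
def group_sub_domains(physical_domains):
    """Form groups of virtual fault sub-domains such that each group
    contains exactly one sub-domain from every physical fault domain.

    physical_domains: {rack name: [virtual sub-domain names]}
    Returns a list of groups (lists of sub-domain names).
    """
    per_rack = [physical_domains[r] for r in sorted(physical_domains)]
    # zip pairs the i-th sub-domain of each rack into group i
    return [list(group) for group in zip(*per_rack)]
```

For a Zone whose storage service keeps three copies, any three members of one group can then serve as the storage fault domains, since they are guaranteed to sit in three different racks.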

For the storage service of each Zone, the selected virtual fault sub-domains are different, which can ensure that the duplicate data of the storage services of different zones are stored in different virtual fault sub-domains, so that when a certain storage device in a certain virtual fault sub-domain fails, the storage services of multiple zones are not affected.

In practical applications, the data to be stored by the storage service of a certain Zone in Openstack may grow, so that the storage service needs to be expanded, and expansion introduces new storage devices. In the existing data storage manner, after a new storage device is introduced, the copy data of multiple Zones' storage services has to be reallocated, that is, data migration may occur; therefore, expanding the storage service of one Zone may affect the storage services of other Zones.

In the embodiment of the present invention, if the storage service of a certain Zone in Openstack needs to expand the capacity, a storage device that does not store data currently may be introduced into each virtual failure sub-domain corresponding to the storage service of the Zone.

Specifically, referring to fig. 4, fig. 4 is another schematic diagram of a method for storing copy data in a ceph in an Openstack according to an embodiment of the present invention, in the embodiment shown in fig. 4, there are three racks, which are respectively a rack 1, a rack 2, and a rack 3, where each rack is a physical fault domain, and there are storage services of three zones in an Openstack, where the copy data in the storage service of each Zone is respectively stored in storage devices in the three racks. Referring to fig. 4, each rack includes at least one storage device that does not store copy data, and the storage device that does not store copy data may be determined as a device to be expanded.

As shown in fig. 4, assuming that the storage service of Zone 1 needs to be expanded, at least one device to be expanded may be added to each virtual fault sub-domain corresponding to the storage service of Zone 1. For example, storage devices a, b, and c that store no copy data may be added to virtual fault sub-domains A, B, and C corresponding to the storage service of Zone 1, respectively. Storage device a and virtual fault sub-domain A belong to the same physical fault domain, rack 1; storage device b and virtual fault sub-domain B belong to the same physical fault domain, rack 2; storage device c and virtual fault sub-domain C belong to the same physical fault domain, rack 3. After the addition is completed, each resulting virtual fault sub-domain may be used as an expansion sub-domain.

In the embodiment of the present invention, the copy data of the storage service of Zone 1, whose capacity needs to be expanded, may be stored in the expansion sub-domains corresponding to that storage service.

It can be seen that, with the method for storing copy data of an Openstack in ceph provided in the embodiment of the present invention, when the storage service of a certain Zone in Openstack needs to be expanded, storage devices that store no copy data in each physical fault domain may be added to the corresponding virtual fault sub-domains of that storage service to obtain a plurality of expansion sub-domains, and the copy data of the Zone's storage service is then stored in these expansion sub-domains. This may cause reallocation of the copy data of that Zone's storage service, but it does not affect the storage services of other Zones.
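The expansion step described above can be sketched like this. It is an illustrative model only; the dict-based representation of sub-domains and the device names are assumptions:

```python
def expand_zone(zone_sub_domains, free_devices_by_rack):
    """Add one free (empty) device from the matching rack to each
    virtual fault sub-domain of the Zone being expanded. Only this
    Zone's copy data then rebalances across the enlarged sub-domains;
    other Zones' devices are untouched."""
    for sub_domain in zone_sub_domains:
        rack = sub_domain["rack"]
        device = free_devices_by_rack[rack].pop()  # device stores no copy data yet
        sub_domain["devices"].append(device)       # sub-domain becomes an expansion sub-domain
    return zone_sub_domains

# Zone 1 maps to sub-domains A, B, C in racks 1-3, as in fig. 4.
zone1 = [
    {"name": "A", "rack": "rack1", "devices": ["osd.0"]},
    {"name": "B", "rack": "rack2", "devices": ["osd.1"]},
    {"name": "C", "rack": "rack3", "devices": ["osd.2"]},
]
free = {"rack1": ["osd.9"], "rack2": ["osd.10"], "rack3": ["osd.11"]}
expanded = expand_zone(zone1, free)
assert all(len(sd["devices"]) == 2 for sd in expanded)
```

Because the added device and the sub-domain share a rack, the physical fault-domain boundaries of the placement are preserved after expansion.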

In the embodiment of the invention, when a storage service of a new Zone is added in Openstack, the storage devices that store no copy data in each physical fault domain may be divided into a number of sub-storage domains equal to the number of newly added storage services that need to store copy data;

for the to-be-stored copy data of each new storage service, a second number of sub-storage domains is selected as capacity expansion storage domains based on the second number of copies of the data to be stored, wherein different sub-storage domains among the selected capacity expansion storage domains belong to different physical fault domains, and the sub-storage domains selected for different new storage services are different;

and the to-be-stored copy data of each new storage service is stored, respectively, in the storage devices of the capacity expansion storage domains corresponding to that new storage service.

Fig. 5 may be referred to for explanation. Fig. 5 is another schematic diagram of a method for storing copy data of an Openstack in ceph according to an embodiment of the present invention. In the embodiment shown in fig. 5, each chassis is a physical fault domain, and there are four physical fault domains, chassis A, chassis B, chassis C, and chassis D, where virtual fault sub-domains A1, A2, B1, B2, C1, and C2 are already used by the storage services of existing Zones. If storage services of two Zones are newly added on this basis, then for the to-be-stored copy data of each new storage service, a number of sub-storage domains equal to the number of copies is selected as capacity expansion storage domains.

Referring to fig. 5, the storage devices that store no copy data in each physical fault domain are divided into two sub-storage domains each, yielding a1, a2, b1, b2, c1, c2, d1, and d2. Then, for the storage service of each newly added Zone, a number of sub-storage domains equal to the number of copies may be selected, where the sub-storage domains selected for one Zone's storage service belong to different physical fault domains, and the sub-storage domains selected for different Zones' storage services are all different. The copy data to be stored may then be stored in the selected sub-storage domains.
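The division and selection for newly added Zones can be sketched as follows. This is a hedged illustration under assumed names and an assumed round-robin split; the actual patented procedure operates on the ceph cluster topology rather than Python dicts:

```python
def divide_free_devices(free_by_rack, num_new_services):
    """Split each physical fault domain's free devices into
    num_new_services sub-storage domains; the i-th sub-storage domain
    of every rack is reserved for the i-th new storage service."""
    domains = []  # domains[i] = list of (rack, devices) for new service i
    for i in range(num_new_services):
        per_service = []
        for rack, devices in free_by_rack.items():
            chunk = devices[i::num_new_services]  # round-robin split
            if chunk:
                per_service.append((rack, chunk))
        domains.append(per_service)
    return domains

def pick_expansion_domains(per_service, replica_count):
    """Choose replica_count sub-storage domains, each from a different
    physical fault domain."""
    if len(per_service) < replica_count:
        raise RuntimeError("not enough physical fault domains for the replica count")
    return per_service[:replica_count]

# Four chassis with two free devices each, two newly added Zones (fig. 5).
free = {
    "chassis_A": ["a-osd0", "a-osd1"],
    "chassis_B": ["b-osd0", "b-osd1"],
    "chassis_C": ["c-osd0", "c-osd1"],
    "chassis_D": ["d-osd0", "d-osd1"],
}
domains = divide_free_devices(free, num_new_services=2)
svc1 = pick_expansion_domains(domains[0], replica_count=3)
svc2 = pick_expansion_domains(domains[1], replica_count=3)
# The two new services use disjoint devices, so neither affects the other.
assert not {d for _, c in svc1 for d in c} & {d for _, c in svc2 for d in c}
```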

In this embodiment of the present invention, before selecting the second number of sub-storage domains as the capacity expansion storage domain, the method may further include:

dividing the obtained sub-storage domains into a plurality of groups of sub-storage domains, wherein different sub-storage domains in each group of sub-storage domains belong to different physical fault domains;

selecting a second number of sub-storage domains as a capacity expansion storage domain, comprising:

and selecting a second number of sub-storage domains from one of the plurality of groups of sub-storage domains as capacity expansion storage domains.

Since the process of selecting the sub-storage domain for the storage service of the newly added Zone is basically the same as the process of selecting the virtual fault sub-domain for the storage service in the foregoing, reference may be made to the above-mentioned embodiment, which is not described herein again.

It can be seen that, with the method for storing copy data of an Openstack in ceph provided in the embodiment of the present invention, if the storage service of a Zone needs to be newly added, the storage devices in the ceph storage system that store no copy data may be divided into sub-storage domains. For the storage service of each new Zone, a number of sub-storage domains equal to the number of copies is selected for storing the copy data, and the sub-storage domains selected for the storage services of different Zones are different, so the storage services of the newly added Zones do not affect one another, and if a certain storage device fails, only one storage service is affected. Moreover, because the storage service of a newly added Zone does not occupy storage devices already used by existing storage services, no data migration is caused; that is, existing data does not need to be redistributed, and the original storage services are not affected.

In the embodiment of the present invention, the copy data may be stored in object storage units (OSDs) in the storage devices; each storage device may include a plurality of OSDs, where an OSD is a physical hardware storage unit.

Referring to fig. 6, fig. 6 is a schematic structural diagram of an apparatus for storing copy data of an Openstack in ceph according to an embodiment of the present invention. The apparatus may include:

a topology obtaining module 601, configured to obtain a topology of a storage device cluster in ceph, where the storage device cluster includes multiple storage devices and the topology represents the division of the storage device cluster into physical fault domains;

a fault sub-domain dividing module 602, configured to divide each physical fault domain of the storage device cluster into a plurality of virtual fault sub-domains;

a fault sub-domain selecting module 603, configured to select, for the to-be-stored copy data of the storage service of each service region Zone in the Openstack, a first number of virtual fault sub-domains as storage fault domains based on a first number of copies of the data to be stored, where different virtual fault sub-domains among the selected storage fault domains belong to different physical fault domains, and the virtual fault sub-domains selected for the storage services of different Zones are different; and

a first storage module 604, configured to store the to-be-stored copy data of the storage service of each Zone in the storage devices of the storage fault domains corresponding to that storage service, respectively.
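As a rough illustration, the cooperation of modules 601-604 could be modeled as a class whose methods correspond to the modules. The method names, the dict-based topology, and the splitting strategy are assumptions for this sketch, not the patented implementation:

```python
class ReplicaPlacementApparatus:
    """Rough model of the apparatus of fig. 6; each method plays the
    role of one module (601-604)."""

    def __init__(self):
        self.placement = {}  # Zone -> its selected storage fault domains

    def obtain_topology(self, cluster):  # topology obtaining module 601
        # cluster: physical fault domain (rack) -> storage devices
        return {rack: list(devices) for rack, devices in cluster.items()}

    def divide_fault_sub_domains(self, topology, parts):  # dividing module 602
        # split every physical fault domain into `parts` virtual fault sub-domains
        return {
            rack: [tuple(devices[i::parts]) for i in range(parts)]
            for rack, devices in topology.items()
        }

    def select_fault_sub_domains(self, sub_domains, zone, first_number):  # selecting module 603
        # pick one unused sub-domain per physical fault domain, so the
        # sub-domains of different Zones never overlap
        taken = {sd for doms in self.placement.values() for sd in doms}
        chosen = []
        for subs in sub_domains.values():
            if len(chosen) == first_number:
                break
            for sub in subs:
                if sub not in taken:
                    chosen.append(sub)
                    break
        self.placement[zone] = chosen
        return chosen

    def store(self, zone, copies):  # first storage module 604
        # pair each copy with one storage fault domain of the Zone
        return list(zip(copies, self.placement[zone]))

cluster = {"rack1": ["osd.0", "osd.1"], "rack2": ["osd.2", "osd.3"], "rack3": ["osd.4", "osd.5"]}
apparatus = ReplicaPlacementApparatus()
topology = apparatus.obtain_topology(cluster)
sub_domains = apparatus.divide_fault_sub_domains(topology, parts=2)
zone1_domains = apparatus.select_fault_sub_domains(sub_domains, "zone1", first_number=3)
plan = apparatus.store("zone1", ["copy-1", "copy-2", "copy-3"])
```

A second call for "zone2" would select the remaining sub-domains, mirroring the guarantee that different Zones never share a virtual fault sub-domain.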

In this embodiment of the present invention, on the basis of the apparatus for storing copy data of an Openstack in ceph shown in fig. 6, the apparatus may further include:

the first grouping module is used for dividing the obtained virtual fault sub-domains into a plurality of groups of virtual fault sub-domains, and different virtual fault sub-domains in each group of virtual fault sub-domains belong to different physical fault domains;

the fault sub-domain selection module is specifically configured to:

from one of the plurality of sets of virtual fault sub-domains, a first number of virtual fault sub-domains is selected as storage fault domains.

In this embodiment of the present invention, on the basis of the apparatus for storing copy data of an Openstack in ceph shown in fig. 6, the apparatus may further include:

the device to be expanded determining module is used for determining the storage device which does not store the duplicate data in each physical fault domain in the ceph as the device to be expanded;

the device adding module is used for adding, for the storage service of a certain Zone in Openstack that needs capacity expansion, at least one device to be expanded to each virtual fault sub-domain corresponding to that storage service, and taking the resulting virtual fault sub-domains as expansion sub-domains, wherein the device to be expanded added to each virtual fault sub-domain belongs to the same physical fault domain as that virtual fault sub-domain;

and the second storage module is used for respectively storing the copy data of the storage service of the Zone with the capacity to be expanded in each expansion subdomain corresponding to the storage service.

In this embodiment of the present invention, on the basis of the apparatus for storing copy data of an Openstack in ceph shown in fig. 6, the apparatus may further include:

the sub-storage domain dividing module is used for dividing the storage devices which do not store the copy data in each physical fault domain into sub-storage domains with the number of services based on the number of services of the storage services of the Zone newly added by the Openstack;

the sub-storage domain selection module is used for selecting, for the to-be-stored copy data of the storage service of each newly added Zone, a second number of sub-storage domains as capacity expansion storage domains based on the second number of copies of the data to be stored, wherein different sub-storage domains among the selected capacity expansion storage domains belong to different physical fault domains, and the sub-storage domains selected for the storage services of different newly added Zones are different;

and the third storage module is used for respectively storing the to-be-stored duplicate data of the storage service of each newly-added Zone into the storage device of each capacity expansion storage domain corresponding to the storage service.

In this embodiment of the present invention, the apparatus may further include a second grouping module, used for dividing the obtained sub-storage domains into a plurality of groups of sub-storage domains, where different sub-storage domains in each group of sub-storage domains belong to different physical fault domains;

the sub-storage domain selection module is specifically configured to:

and selecting a second number of sub-storage domains from one of the plurality of groups of sub-storage domains as capacity expansion storage domains.

In the embodiment of the invention, the storage device includes a plurality of object storage units (OSDs), and the copy data are all stored in the OSDs of the storage device.

It can be seen that, with the apparatus for storing copy data of an Openstack in ceph provided in the embodiment of the present invention, if the storage service of a Zone needs to be newly added, the storage devices in the ceph storage system that store no copy data may be divided into sub-storage domains. For the storage service of each new Zone, a number of sub-storage domains equal to the number of copies is selected for storing the copy data, and the sub-storage domains selected for the storage services of different Zones are different, so the storage services of the newly added Zones do not affect one another, and if a certain storage device fails, only one storage service is affected. Moreover, because the storage service of a newly added Zone does not occupy storage devices already used by existing storage services, no data migration is caused; that is, existing data does not need to be redistributed, and the original storage services are not affected.

An embodiment of the invention further discloses an electronic device, as shown in fig. 7, which includes a processor 701, a communication interface 702, a memory 703, and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 communicate with each other through the communication bus 704;

the memory 703 is configured to store a computer program;

the processor 701 is configured to implement any of the method steps described above when executing the program stored in the memory 703.

The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic equipment and other equipment.

The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another and do not necessarily require or imply any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a(n) ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, the electronic device, the computer-readable storage medium, and the computer program product embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiments.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (14)

1. A method for storing copy data in a distributed file system (ceph) in a cloud computing management platform (Openstack) is characterized by comprising the following steps:

acquiring a topological structure of a storage device cluster in ceph, wherein the storage device cluster comprises a plurality of storage devices, and the topological structure represents the division condition of a physical fault domain of the storage device cluster;

dividing each physical fault domain of the storage equipment cluster into a plurality of virtual fault sub-domains;

for to-be-stored copy data of storage services of each service area Zone in an Openstack, selecting a first number of virtual fault sub-domains as storage fault domains based on a first number of copy data to be stored, wherein different virtual fault sub-domains in the selected storage fault domains belong to different physical fault domains, and the virtual fault sub-domains selected for the storage services of different zones are different;

and respectively storing the to-be-stored duplicate data of the storage service of each Zone in the storage device of each storage fault domain corresponding to the storage service.

2. The method of claim 1, further comprising, prior to selecting the first number of virtual fault sub-domains as storage fault domains:

dividing the obtained virtual fault sub-domains into a plurality of groups of virtual fault sub-domains, wherein different virtual fault sub-domains in each group of virtual fault sub-domains belong to different physical fault domains;

the selecting the first number of virtual fault sub-domains as storage fault domains comprises:

selecting the first number of virtual fault sub-domains as storage fault domains from a set of virtual fault sub-domains of the plurality of sets of virtual fault sub-domains.

3. The method of claim 1, further comprising:

determining storage equipment which does not store the duplicate data in each physical fault domain in the ceph as equipment to be expanded;

aiming at a certain storage service of a Zone needing capacity expansion in Openstack, adding at least one device to be expanded in each virtual fault sub-domain corresponding to the storage service of the Zone, and taking the obtained virtual fault sub-domain as a capacity expansion sub-domain, wherein the device to be expanded added in each virtual fault sub-domain and the virtual fault sub-domain belong to the same physical fault domain;

and respectively storing the copy data of the storage service of the Zone needing capacity expansion in each capacity expansion subdomain corresponding to the storage service.

4. The method of claim 1, further comprising:

dividing storage equipment which does not store copy data in each physical fault domain into service number sub-storage domains based on the service number of storage services of the Zone newly added by the Openstack;

for the to-be-stored copy data of the storage service of each newly-added Zone, based on a second quantity of the copy data to be stored, selecting the second quantity of sub-storage domains as capacity expansion storage domains, wherein different sub-storage domains in the selected capacity expansion storage domains belong to different physical fault domains, and the sub-storage domains selected for the storage services of different newly-added zones are different;

and respectively storing the to-be-stored copy data of the storage service of each newly-added Zone in the storage device of each capacity expansion storage domain corresponding to the storage service.

5. The method of claim 4, further comprising, prior to selecting the second number of sub-storage domains as capacity expansion storage domains:

dividing the obtained sub-storage domains into a plurality of groups of sub-storage domains, wherein different sub-storage domains in each group of sub-storage domains belong to different physical fault domains;

the selecting the second number of sub-storage domains as a capacity expansion storage domain includes:

and selecting the second number of sub-storage domains as capacity expansion storage domains from one of the plurality of groups of sub-storage domains.

6. The method according to any one of claims 1-5, wherein the storage device comprises a plurality of object storage units (OSD), and the duplicate data are stored in the OSD in the storage device.

7. An apparatus for storing replica data in a distributed file system ceph in a cloud computing management platform Openstack, the apparatus comprising:

the topology structure obtaining module is used for obtaining a topology structure of a storage device cluster in ceph, wherein the storage device cluster comprises a plurality of storage devices, and the topology structure represents the division condition of a physical fault domain of the storage device cluster;

the fault sub-domain dividing module is used for dividing each physical fault domain of the storage device cluster into a plurality of virtual fault sub-domains;

the fault sub-domain selection module is used for selecting a first number of virtual fault sub-domains as storage fault domains according to a first number of copy data to be stored of storage services of each service Zone in Openstack, wherein different virtual fault sub-domains in the selected storage fault domains belong to different physical fault domains, and the virtual fault sub-domains selected according to the storage services of different zones are different;

and the first storage module is used for respectively storing the to-be-stored duplicate data of the storage service of each Zone into the storage device of each storage fault domain corresponding to the storage service.

8. The apparatus of claim 7, further comprising:

the first grouping module is used for dividing the obtained virtual fault sub-domains into a plurality of groups of virtual fault sub-domains, and different virtual fault sub-domains in each group of virtual fault sub-domains belong to different physical fault domains;

the fault sub-domain selection module is specifically configured to:

selecting the first number of virtual fault sub-domains as storage fault domains from a set of virtual fault sub-domains of the plurality of sets of virtual fault sub-domains.

9. The apparatus of claim 7, further comprising:

the device to be expanded determining module is used for determining the storage device which does not store the duplicate data in each physical fault domain in the ceph as the device to be expanded;

the device adding module is used for adding at least one device to be expanded in each virtual fault sub-domain corresponding to a certain Zone storage service needing capacity expansion in the Openstack, and taking the obtained virtual fault sub-domain as an expansion sub-domain, wherein the device to be expanded added in each virtual fault sub-domain and the virtual fault sub-domain belong to the same physical fault domain;

and the second storage module is used for respectively storing the copy data of the storage service of the Zone needing capacity expansion in each capacity expansion sub-domain corresponding to the storage service.

10. The apparatus of claim 7, further comprising:

the sub-storage domain dividing module is used for dividing the storage devices which do not store the copy data in each physical fault domain into sub-storage domains of the service number based on the service number of the storage services of the Zone newly added by the Openstack;

the sub-storage domain selection module is used for selecting a second number of sub-storage domains as capacity expansion storage domains according to the second number of the copy data to be stored of the storage services of each newly added Zone, wherein different sub-storage domains in the selected capacity expansion storage domains belong to different physical fault domains, and the sub-storage domains selected according to the storage services of different newly added zones are different;

and the third storage module is configured to store the to-be-stored copy data of the storage service of each newly-added Zone in the storage device of each capacity expansion storage domain corresponding to the storage service.

11. The apparatus of claim 10, further comprising:

the second grouping module is used for dividing the obtained sub-storage domains into a plurality of groups of sub-storage domains, and different sub-storage domains in each group of sub-storage domains belong to different physical fault domains;

the sub-storage domain selection module is specifically configured to:

and selecting the second number of sub-storage domains as capacity expansion storage domains from one of the plurality of groups of sub-storage domains.

12. The apparatus according to any one of claims 7-11, wherein the storage device comprises a plurality of object storage units OSD, and the duplicate data is stored in the OSD in the storage device.

13. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete communication with each other through the communication bus;

the memory is used for storing a computer program;

the processor, when executing the program stored in the memory, implementing the method steps of any of claims 1-6.

14. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 6.
