US8607086B2 - Massively multicore processor and operating system to manage strands in hardware - Google Patents
- Tue Dec 10 2013
Info
-
- Publication number: US8607086B2
- Application number: US13/333,802
- Authority: US (United States)
- Prior art keywords: cpus, stack, memory, protocol, operating system
- Prior art date: 2011-09-02
- Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/3237—Power saving characterised by the action undertaken by disabling clock generation or distribution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/324—Power saving characterised by the action undertaken by lowering clock frequency
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power management, i.e. event-based initiation of a power-saving mode
- G06F1/3234—Power saving characterised by the action undertaken
- G06F1/3287—Power saving characterised by the action undertaken by switching off individual functional units in the computer system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5094—Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the application generally relates to computing devices having multiple processors and, more specifically, to a multicore processor and operating system based on a protocol stack.
- Computing devices such as desktop computers, laptop computers, cell phones, smartphones, personal digital assistants (PDA), and many other electronic devices are widely deployed.
- the primary element of such computing devices is a central processing unit (CPU), or a processor, which is responsible for executing instructions of one or more computer programs.
- the CPU executes each program instruction in sequence to perform the basic arithmetical, logical, and input/output operations of the computing device.
- Design and implementation of such devices in general, and CPUs in particular, may vary; however, their fundamental functionalities remain very similar.
- the CPU is coupled to a memory and an Input/Output (I/O) subsystem, directly or through a bus, to perform the main functions of computing devices such as inputting and outputting data, processing data, and so forth.
- the memory may embed an operating system (OS), computer programs, applications, and so forth.
- OS operating system
- Conventional operating systems are quite similar in architecture, in that each tends to have conventional file and memory operations, storage and graphical user interface operations, and so forth.
- Architectures of conventional operating systems include a layered design, device drivers, and Application Programming Interfaces (APIs).
- a core kernel essentially has master control over all the operations of the overlying software, components, device drivers, applications, and so forth.
- operating systems implement ‘multi-tasking’ through time slicing and sequential allocation of computer resources to various threads and processes.
- a thread generally runs within a process and shares resources, e.g., memory, with other threads within the same process, whereas a process generally runs ‘self-contained’ within its own right and completely independently of any other process.
- in multi-tasking, when a computing device includes a single processor, the operating system instructs the processor to switch between different threads and execute them sequentially. Switching generally happens frequently enough that the user may perceive the threads (or tasks) as running simultaneously.
- some computing devices include multicore processors, which may truly allocate multiple threads or tasks to run at the same time on different cores.
- conventional multicore processor architectures involve a small number of cores (typically 2, 4, 6, or 8 cores) due to the design limitations of traditional hardware and traditional operating systems.
- the computing device still must implement time slicing and switching between different threads on each of its cores when performing several tasks involving multithreading allocated through the cores. In other words, even conventional multicore processors cannot implement true multitasking.
- various embodiments provide a computing device having multiple CPUs interconnected to each other.
- Each CPU embeds an operating system of an entirely new architecture.
- This operating system may be based fundamentally around an Internet stack, for example, the TCP/IP stack (instead of including a TCP/IP layer as in a conventional core operating system) and may utilize a conventional interface or similar extensions of the standard Berkeley Sockets (or WinSock) APIs.
- a computing apparatus may comprise a set of interconnected central processing units.
- Each CPU may embed an operating system (OS) comprising an operating system kernel, the operating system kernel being a state machine and comprising a protocol stack.
- At least one of the CPUs may further embed executable instructions for allocating multiple strands to one or more other CPUs of the set of interconnected CPUs.
- a strand, as used herein, is a hardware-oriented process and is not necessarily similar to a conventional unit of processing (i.e., a thread) that can be scheduled by an operating system.
- the Internet stack is a set of communication protocols used for the Internet and other similar networks.
- the Internet stack may comprise a TCP/IP stack such that the OS kernel is a TCP/IP stack state machine with proprietary extensions that can be used to change or access internals of the TCP/IP stack state machine.
- the Internet stack may comprise a User Datagram Protocol/Internet Protocol (UDP/IP) stack such that the OS kernel is a UDP/IP stack state machine with proprietary extensions that can be used to change or access internals of the UDP/IP stack state machine.
- the CPU may comprise a processing unit, a memory and an I/O interface.
- Executable instructions for the operating system may be stored within one or more types of storage media, such as for example, Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Field Programmable Read-Only Memory (FPROM), One-Time Programmable Read-Only Memory (OTPROM), One-Time Programmable Non-Volatile Memory (OTP NVM), Erasable Programmable Read-Only Memory (EPROM), and Electrically Erasable Programmable Read-Only Memory (EEPROM or Flash ROM).
- the computing apparatus may further comprise at least one asynchronous clock to serve as an internal clock for the operating system.
- the asynchronous clock may be configurable to automatically stop when clock cycles are no longer needed.
- a time reference for the operating system kernel may be based, for example, on a Network Time Protocol (NTP), Simple Network Time Protocol (SNTP), or other suitable time protocol from a remote time server.
- NTP Network Time Protocol
- SNTP Simple Network Time Protocol
- the operating system may utilize a Sockets style API of sockets and ports on IP addresses for handling I/O requests.
- the set of CPUs may be interconnected through a bus. Executable instructions for the operating system may be executed through a Sockets API.
- the at least one CPU that embeds executable instructions for allocating multiple strands may further comprise instructions for generating multiple strands.
- a method for operating a computing apparatus may comprise receiving I/O requests, generating multiple strands according to the I/O requests, allocating the multiple strands to one or more CPUs of a set of CPUs, and processing the multiple strands.
- Each CPU may embed an operating system (OS) having a kernel comprising a protocol stack.
- the I/O requests may be received by a CPU, which embeds executable instructions for allocating multiple strands through multiple CPUs.
- Allocating multiple strands may comprise communicating data via a network interface.
- the method may further comprise assembling results of multiple strands processing.
- Executable instructions for the operating system may be stored in a memory and executed through a Sockets API.
- a non-transitory computer-readable storage medium may have instructions embodied thereon, the instructions being executable by a processor in a computing device to perform a method.
- the method may comprise receiving an input/output (I/O) request, generating one or more strands according to the I/O request, allocating the one or more strands and/or processes to one or more central processing units (CPUs) of a set of CPUs, wherein each CPU of the set embeds an operating system (OS) having a kernel comprising a protocol stack, and processing the one or more strands and/or processes.
- I/O input/output
- CPUs central processing units
- FIG. 1 is a block diagram of a CPU, according to various exemplary embodiments.
- FIG. 2 illustrates an exemplary architecture of an Internet stack state machine-based system, according to various embodiments.
- FIG. 3 is a flow chart illustrating a method for a CPU embedding a protocol stack-based operating system, according to an exemplary embodiment.
- FIG. 4 is a block diagram of a computing device, according to various exemplary embodiments.
- FIG. 5 is a computing environment, according to various exemplary embodiments.
- FIG. 6 is a flow chart of a method for processing I/O requests by a computing device comprising multiple CPUs with embedded Internet stack-based operating systems, according to an exemplary embodiment.
- Various embodiments disclosed herein relate to computing devices comprising a set of interconnected CPUs.
- the number of the CPUs is not limited, and may be more than 100, or even more than 10,000, depending on the specific application of the computing devices.
- the CPUs may be interconnected (e.g., through one or more buses) so that multiple strands, processes, and tasks can be allocated among a few or even all CPUs, thereby implementing parallelism or true multi-tasking.
- each of some or all of the CPUs is allocated a respective strand.
- the term “central processing unit” relates to a processor, a microprocessor, a controller, a microcontroller, a chip, or other processing device that carries out arithmetic and logic instructions of an operating system, a computer program, an application, or the like.
- the CPU comprises a processing unit (typically including an arithmetic logic unit and a control unit) and a memory (e.g., registers or Read Only Memory (ROM)).
- the CPU may further comprise an I/O Subsystem (Interface) to allow data transfer between the CPU and any other devices such as another CPU or I/O devices such as a keyboard, mouse, printer, monitor, network controller, and so forth.
- I/O Subsystem Interface
- the CPU memory may store an operating system based entirely on a protocol stack.
- a protocol stack, as used herein, is a particular software implementation of a computer networking protocol suite.
- the protocol stack may be a TCP/IP stack, UDP/IP stack, Internet Control Message Protocol (ICMP) stack, combinations thereof, or other protocols.
- ICMP Internet Control Message Protocol
- the operating system embedded in the CPU is fundamentally a state machine.
- the kernel of the operating system is fundamentally a protocol stack.
- Such an operating system is inherently Internet-oriented and all Internet type functionality is natural and inherent in its protocol stack-based processor design and implementation.
- such an operating system may operate within small hardware, be run by very compact and efficient software, require minimal clock cycles for execution, and have a natural Internet connectivity model and ultra-low power consumption.
- FIG. 1 illustrates a block diagram of an exemplary CPU 100.
- the CPU 100 may be a processor, a microprocessor, a chip, or the like.
- the CPU 100 may include a memory 110, which may embed an operating system and, optionally, further software applications.
- the operating system may comprise a kernel to provide communications between software and hardware components/modules.
- the kernel may be a state machine with extensions and may comprise an Internet stack.
- the Internet stack may include a set of communication protocols used for the Internet and similar networks.
- the Internet stack may include a TCP/IP stack so that the OS kernel is a TCP/IP stack state machine.
- the Internet stack includes a UDP/IP stack such that the OS kernel is a UDP/IP stack state machine.
- the Internet stack includes an ICMP stack such that the OS kernel is an ICMP stack state machine.
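To make the idea of a kernel that is "fundamentally a state machine" more concrete, the following is a minimal sketch, not the patented implementation. It models only the passive-open subset of the standard TCP connection states as a transition table. The class name `StackKernel`, the event names, and the helper `run_events` are hypothetical and chosen purely for illustration.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()
    LISTEN = auto()
    SYN_RCVD = auto()
    ESTABLISHED = auto()
    CLOSE_WAIT = auto()
    LAST_ACK = auto()


# (current state, event) -> next state; passive-open subset of the TCP states.
TRANSITIONS = {
    (State.CLOSED, "passive_open"): State.LISTEN,
    (State.LISTEN, "rcv_syn"): State.SYN_RCVD,
    (State.SYN_RCVD, "rcv_ack"): State.ESTABLISHED,
    (State.ESTABLISHED, "rcv_fin"): State.CLOSE_WAIT,
    (State.CLOSE_WAIT, "close"): State.LAST_ACK,
    (State.LAST_ACK, "rcv_ack"): State.CLOSED,
}


class StackKernel:
    """Hypothetical kernel whose only control logic is a transition table."""

    def __init__(self):
        self.state = State.CLOSED

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not valid in {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state


def run_events(events):
    """Feed a sequence of events to a fresh kernel and list the visited states."""
    kernel = StackKernel()
    return [kernel.handle(event).name for event in events]
```

A table-driven design keeps the control logic small, which is in the spirit of the compact kernel the description aims for; the proprietary extensions mentioned in the text are not modeled here.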
- the memory 110 may store one or more modules. Exemplary modules, which may be stored in the memory 110, include an I/O request receiver module 120, a protocol handling module 130, an I/O request processing module 140, and an optional network interface module 150. It will be appreciated by one skilled in the art that the technology described herein encompasses those embodiments where one or more of the modules may be combined with each other or not included in the memory 110 at all.
- the CPU 100 may further include a processing unit 160 for executing various instructions and running modules stored in the memory 110.
- the processing unit 160 may comprise an arithmetic logic unit to carry out mathematical functions, and a control unit to regulate data flow through the processing unit 160 and the CPU 100.
- Those skilled in the art would understand that any suitable architecture of the processing unit 160 is applicable.
- a module should be generally understood as one or more applications (routines) that perform various system-level functions and may be dynamically loaded and unloaded by hardware and device drivers as required.
- the modular software components described herein may also be integrated as part of an application specific component.
- the modules may each include executable instructions for the operating system embedded into CPU 100 and may be executed through a Sockets API.
- the I/O request receiver module 120 may be configured to receive I/O requests.
- the requests may be from an application residing in an application layer of a computing device (as described in further detail with respect to FIG. 2).
- the protocol handling module 130 may be configured to handle a specific protocol for the protocol stack state machine implementation.
- the protocol may be a TCP/IP stack such that the operating system is a TCP/IP stack state machine.
- the protocol stack may include a different protocol stack (e.g., a UDP/IP stack or ICMP stack which may be used in addition to or in place of the TCP/IP stack).
- the operating system may utilize Sockets style API of sockets and ports on IP addresses for handling I/O requests.
- the I/O request processing module 140 may be configured to process the I/O requests from an application according to the network protocol using the operating system.
- the optional network interface module 150 may be included and is configured to provide an interface between the protocol stack state machine and a network interface.
- the corresponding network interface may be a hardware unit or a “soft” Ethernet controller.
- the CPU 100 may also comprise a clock.
- the CPU 100 may require a clock to drive the state transitions as the CPU 100, for instance, reads and decodes opcodes. Conventionally this is done by external oscillator circuitry, typically driven by a fixed-frequency crystal. However, clocking may also be done by more than one crystal, e.g., a high-frequency crystal (e.g., 50 MHz) for the main CPU core and other (lower-frequency) crystals for other uses, e.g., programmable timers, watchdog timers, etc.
- a system comprising, for instance, a Universal Asynchronous Receiver/Transmitter (UART) and a Network Interface Controller (NIC) also typically requires clock inputs of some sort. For instance, a UART may need a reliable clock source for rates from perhaps 300 baud up to 921,600 baud. A NIC running 100 Mbit/s Ethernet would typically need a clock source of 50 MHz or 25 MHz.
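The baud range quoted above can be related to a clock source with a short worked calculation. The sketch below assumes a conventional UART that oversamples each bit 16 times and an illustrative 14.7456 MHz reference clock. Neither value comes from the source, and the function name `uart_divisor` is hypothetical.

```python
def uart_divisor(clock_hz, baud, oversampling=16):
    """Return the integer clock divisor and the baud-rate error in percent.

    Assumes a UART that samples each bit `oversampling` times (16 is a common
    choice, used here as an assumption rather than taken from the source).
    """
    divisor = max(1, round(clock_hz / (oversampling * baud)))
    actual_baud = clock_hz / (oversampling * divisor)
    return divisor, 100.0 * (actual_baud - baud) / baud
```

With the assumed 14,745,600 Hz clock, `uart_divisor(14_745_600, 921_600)` gives a divisor of 1 and `uart_divisor(14_745_600, 300)` gives a divisor of 3072, both with zero error, which is why a single crystal can cover the whole range.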
- a computer system needs to keep track of time, and can do so using internal counters to keep track of its internal clocks.
- the device is connected to the Internet and thus has readily available external time sources, for instance from Network Time Protocol (NTP), Simple Network Time Protocol (SNTP), or other suitable time protocols from a remote server (i.e., time protocol servers).
- the processing unit 160 may utilize a time reference using the NTP, SNTP, or other suitable time protocol from a remote time server.
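As an illustration of obtaining a time reference over SNTP, the sketch below builds a client request and decodes the transmit timestamp of a reply. It deliberately performs no network I/O, and the function names are hypothetical. The packet layout (48 bytes, mode 3 for a client, transmit timestamp in bytes 40 to 47, epoch of 1 January 1900) follows the public SNTP format rather than anything specific to the patent.

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800


def build_sntp_request(version=3):
    """Return a 48-byte client request: LI=0, VN=version, Mode=3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_transmit_time(packet):
    """Extract the server transmit timestamp (bytes 40..47) as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("SNTP packet must be at least 48 bytes")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_OFFSET + fraction / 2**32
```

In a real device the request would be sent as a UDP datagram to port 123 of a time server, and the reply passed to `parse_transmit_time`; round-trip delay compensation is omitted from this sketch.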
- the Precision Time Protocol PTP
- LAN Local Area Network
- an asynchronous (variable) clock may serve as an internal clock for the operating system for the CPU 100 .
- the asynchronous clock may be configurable to automatically stop when clock cycles are no longer needed.
- the asynchronous system clock may be restarted by a wake-up “daemon” signal from the SNMP daemon (for example, an incoming data packet).
- the internal clock may be used to trigger the CPU 100 .
- the internal clock may be utilized until the CPU 100 is fully active, at which time most or all of the clock requirements may be transitioned to external time protocols, e.g., using Internet time servers using NTP, SNTP, or other suitable time protocols from a remote time server, or using PTP and SNMP to take over the control of the clocking operations. This would mean that internal clock circuitry for CPU 100 could be turned off, thus conserving power.
- Executable instructions for the CPU 100 may be optimized to be more efficient than those for conventional CPUs so that much lower clock rates can be used.
- a self-adjusting cycle rate may be provided depending on the load and function to be performed.
- self-learning or predetermined algorithms for expected scenarios may be utilized to put the CPU 100 into a ‘sleep’ or ‘doze’ mode.
- An expected external event may cause the CPU 100 to exit the doze mode, resume full-speed operation to execute necessary operations and handle the external event, and then return to doze.
- the CPU register contents may be read and stored in special registers with long deep-sleep data maintaining capabilities. Such clock saving measures may yield substantial power savings.
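The doze and wake behavior described above can be pictured with a toy model in which clock cycles are counted only while there is work to do. This is a simulation under stated assumptions, not a hardware design: the class `DozingCpu`, the mode names, and the fixed cost of three cycles per event are all invented for illustration.

```python
from collections import deque


class DozingCpu:
    """Toy model: the clock ticks only while there are events to handle."""

    def __init__(self):
        self.mode = "doze"
        self.cycles = 0          # clock cycles actually spent
        self.pending = deque()   # external events waiting to be handled
        self.handled = []

    def wake(self, event):
        """An external event (e.g., an incoming packet) restarts the clock."""
        self.pending.append(event)
        self.mode = "active"

    def run(self, cycles_per_event=3):
        """Handle all pending events, then stop the clock and doze again."""
        while self.pending:
            event = self.pending.popleft()
            self.cycles += cycles_per_event
            self.handled.append(event)
        self.mode = "doze"
        return self.cycles
```

Calling `run()` on an idle instance consumes no cycles, which mirrors the idea of a clock that stops automatically when clock cycles are no longer needed.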
- FIG. 2 illustrates an exemplary architecture 200 for a TCP/IP stack state machine-based system, according to various embodiments.
- the operating system kernel may include various components operating between applications 210 and hardware 220 .
- the kernel may include a TCP stack 232, UDP stack 236, and/or ICMP stack 240, around which the operating environment may be built.
- the kernel may include TCP extensions 230, UDP extensions 234, and ICMP extensions 238, which together with the respective TCP stack 232, UDP stack 236, and ICMP stack 240 are shown above an IP layer 250.
- the kernel may include one or more device drivers 260, 262, and 264, as well as an Ethernet controller 270.
- the API for all operations of the operating system may include the conventional Berkeley Sockets style API of sockets and ports on IP addresses.
- the Berkeley Sockets may specify the data structures and function calls that interact with the network subsystem of the operating system.
- the kernel may handle the normal Sockets APIs.
- the Sockets API 280 may also include some optimized APIs.
- Any non-conventional functions may be handled in a similar manner (e.g., by opening sockets and binding to ports).
- local input and output devices (e.g., keyboards, mice, and display screens) may be accessed through socket/port operations.
- a keyboard could be at a local host at, for example, 127.0.0.1, or remote at another IP address.
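The notion of a keyboard reachable as a socket on the local host can be sketched with ordinary Berkeley-style sockets. In the example below a thread plays the role of the keyboard device and a client reads from it at 127.0.0.1. The function names, the use of TCP, and the payload are assumptions for illustration; the source does not specify how such a device would be exposed.

```python
import socket
import threading


def serve_keyboard(server_sock, keystrokes):
    """Hypothetical 'keyboard device': sends keystrokes to whoever connects."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(keystrokes)


def read_keyboard(host, port):
    """Read all input from the keyboard socket until the device closes it."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as sock:
        while True:
            data = sock.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    # Port 0 asks the OS for any free port; 127.0.0.1 is the local host.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        host, port = server.getsockname()
        device = threading.Thread(target=serve_keyboard, args=(server, b"hello\n"))
        device.start()
        typed = read_keyboard(host, port)
        device.join()
    return typed
```

Because `read_keyboard` takes only a host and a port, pointing it at a remote IP address instead of 127.0.0.1 would need no change to the reading code, which is the transparency the description refers to.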
- while this transparency may be an aspect of other operating systems, it may not be inherent in the operating system design from the outset.
- the size of a basic kernel may be very small in a minimal configuration, perhaps as small as a few hundred bytes.
- the Windows Sockets technology above is mentioned merely for the purpose of providing an example. In contrast to the present technology, in the Windows Sockets technology communications with a display device over the Internet may be cumbersome.
- FIG. 3 is a flow chart illustrating an exemplary method 300 for a CPU embedding a protocol stack-based operating system.
- the method 300 may commence at operation 310 with receiving an I/O request.
- the request may be from an application residing in an applications layer 210 of a computing device.
- in operation 320, the network protocol may be determined.
- the protocol is TCP/IP, so that the operating system is a TCP/IP stack state machine.
- the protocol is UDP/IP.
- UDP is an unreliable, connectionless protocol sitting on top of IP, whereas TCP is a connection-oriented, reliable protocol.
- the protocol may be a hybrid of TCP and UDP, wherein a data connection stream includes a mixture of UDP and TCP packets.
- UDP has less overhead and is suitable for lower-importance information, whereas TCP has a higher overhead but essentially guarantees delivery.
- a stream of data comprising non-essential information (such as low-importance data) mixed with critical data could better be transmitted over such a hybrid link.
- This hybrid protocol may be determined in operation 320 .
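One way to picture the hybrid link is a per-message choice of transport. The sketch below assumes each message carries a `critical` flag, which the source does not define, and only labels messages with the transport they would use; opening and managing the actual sockets is left out.

```python
import socket
from dataclasses import dataclass


@dataclass
class Message:
    payload: bytes
    critical: bool  # hypothetical flag; the source does not define one


def choose_transport(message):
    """Pick TCP for critical data and UDP for low-importance data."""
    return socket.SOCK_STREAM if message.critical else socket.SOCK_DGRAM


def plan_hybrid_stream(messages):
    """Label each message in a mixed stream with the transport it would use."""
    names = {socket.SOCK_STREAM: "TCP", socket.SOCK_DGRAM: "UDP"}
    return [(names[choose_transport(m)], m.payload) for m in messages]
```

For example, a stream containing a telemetry sample followed by a control command would be planned as one UDP datagram and one TCP segment, trading overhead against delivery guarantees per message.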
- the I/O request may be processed according to the network protocol.
- the processing may be performed by the state machine that is the operating system (e.g., a TCP/IP stack state machine operating system).
- the operating system may utilize a Sockets style API of sockets and ports on IP addresses for handling I/O requests.
- the conventional Berkeley Sockets style API of sockets and ports on IP addresses may be used.
- the Berkeley Sockets may specify the data structures and function calls that interact with the network subsystem of the operating system.
- FIG. 4 is a block diagram of a computing device 400, according to an exemplary embodiment.
- the computing device 400 may comprise five CPUs 410, 412, 414, 416, and 418. Although five CPUs are shown, it will be appreciated by one skilled in the art that any number of CPUs may be used in the computing device 400. Some embodiments may include up to 10,000 CPUs or even more.
- the CPUs 410, 412, 414, 416, and 418 may all be coupled to a bus line 420 so that they may communicate data amongst each other.
- each CPU embeds an operating system based on a protocol stack.
- the protocol stack may be a TCP/IP protocol stack, UDP/IP stack, combinations thereof (i.e., hybrid stack), or other appropriate protocols.
- the CPUs 410, 412, 414, 416, and 418 may each include a memory storing an operating system and/or any further executable instructions and/or data.
- the memory can be implemented within the CPU or externally. In one example, all CPUs 410, 412, 414, 416, and 418 may share a single memory coupled to the bus 420.
- the term “memory” refers to any type of long term, short term, volatile, nonvolatile, or other storage devices and is not limited to any particular type of memory or number of memories, or type of media upon which memory is stored.
- the CPUs 410, 412, 414, 416, and 418 in the example in FIG. 4 may further comprise an I/O Interface (not shown) implemented as software and/or hardware.
- One particular example of software implementation of the I/O Interface is shown as optional Network Interface Module 150 in FIG. 1.
- a hardware implementation may comprise an I/O controller, a Network Interface Controller (NIC) such as an Ethernet controller, or the like. It will be apparent to those skilled in the art that the I/O interface may support any communication standards and provide communications over a serial connection, parallel connection, FireWire connection, Ethernet connection, and so forth.
- Each of the CPUs may further comprise a clock (not shown), which can be implemented within each CPU or externally. According to various embodiments, a single clock may be shared by all CPUs.
- One or more of the CPUs may embed a Master Control Program (MCP) 430 .
- MCP Master Control Program
- the CPU 410 embeds the MCP 430.
- the MCP 430 is an application or a routine for managing operations of the remaining CPUs 412, 414, 416, and 418 and, therefore, the CPU 410 may be considered a “Master Core.” More specifically, the MCP 430 may be configured to receive I/O requests from outside devices, generate multiple strands (processes, tasks) according to the I/O requests, and allocate these strands (processes) to the other CPUs 412, 414, 416, and 418 so that the overall computational load is selectively distributed among the CPUs 412, 414, 416, and 418.
- strands may be allocated to one, some, or all of the CPUs 412, 414, 416, and 418.
- the results of the computations may be assembled in the Master Core for further outputting.
- the CPUs 412, 414, 416, and 418 may deliver results directly to corresponding external devices.
- the computing device 400 may comprise several Master Cores for processing different types of I/O requests.
- one Master Core may process all incoming I/O requests, while other Master Cores may be utilized for assembling the output of multiple CPUs, and transmitting of the assembled output results to corresponding outside devices.
- the MCP physically allocates a hardware core stack to the strand (or process).
- An allocated core stack/strand combination may also be referred to as a “core strand”.
- the cores (or core strands) may form a massive array in which core strands may be wired as a block to share resources (e.g., memory), or allowed to share the resources over their interconnects. Cores in the (massive) array of cores may be connected to each other, e.g., interconnected by a web-like structure.
- Cores may be allocated processes in some embodiments, i.e., cores which are processes or “process cores”. Such exemplary process cores are naturally isolated from other process cores since processes run independently of other processes, each process containing its own resources, in contrast to strands, where resources may be shared therebetween.
- the computing device 400 allows only a certain number of CPUs to operate while the remaining CPUs, not involved in the processing, are turned off.
- the computing device 400 may comprise 1,000 CPUs and a single Master Core.
- the Master Core may generate 600 strands (variously within a number of processes) and allocate them to 600 CPUs.
- the remaining 400 CPUs may be turned off to conserve power. If another 100 strands later become needed, 100 of the 400 CPUs may be turned on in response to the allocating of the 100 strands to them so that the total number of the CPUs executing instructions becomes 700.
- the overall power consumption is reduced compared to the traditional system where all processors run all the time, even if there is no process or strand to execute.
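The arithmetic of the example above (1,000 CPUs, 600 strands, then 100 more) can be traced with a small model. The class below is a hypothetical bookkeeping sketch that assumes one CPU is powered on per strand; it does not model the Master Core itself or any real power-control hardware.

```python
class CpuArray:
    """Toy model of an array of CPUs that are powered only while in use."""

    def __init__(self, total):
        self.total = total
        self.powered_on = 0  # all CPUs start switched off

    def allocate(self, strands):
        """Power on one CPU per new strand; fail if the array is exhausted."""
        if self.powered_on + strands > self.total:
            raise RuntimeError("not enough CPUs for the requested strands")
        self.powered_on += strands
        return self.powered_on

    @property
    def powered_off(self):
        return self.total - self.powered_on
```

Starting from `CpuArray(1000)`, allocating 600 strands leaves 400 CPUs off, and allocating a further 100 raises the number of running CPUs to 700 and leaves 300 off.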
- the computing device 400 may facilitate greater stability of operations when compared to conventional multicore processors. When one of the strands crashes, for example, due to a poorly written routine or for some other reason, only the CPU running the strand is affected, while other CPUs remain unaffected. This is in contrast to conventional systems where the entire multicore processor may become affected by a single strand crash.
- FIG. 5 illustrates an exemplary embodiment of a computing environment 500 .
- the computing environment 500 may comprise a computing device 510 (which is described in greater detail with reference to FIG. 4), a memory 520, a clock 530, and communication ports 540, all of which may be coupled to a bus 550.
- the memory 520 may include any memory configured to store and retrieve data. Some examples of the memory 520 include storage devices, such as a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, digital video disk (DVD), any other optical medium, any other physical medium with patterns of marks or holes, a RAM, a ROM, a PROM, an EPROM, an EEPROM, a FLASHEPROM, OTPROM, OTP NVM, Flash ROM or any other memory chip or cartridge, or any other medium from which a computer can read instructions.
- the memory 520 may comprise a data structure configured to hold and organize data.
- the memory 520 may comprise executable instructions of the operating system and/or other routines and applications.
- the memory 520 may also comprise a MCP, as described above with reference to FIG. 4 .
- the clock 530 may serve as an asynchronous clock for the operating system for one or more CPUs of the computing device 510 .
- the asynchronous clock may be configured to automatically stop when clock cycles are not needed.
- Communication ports 540 represent a connection interface that allows asynchronous transmission of data between the computing environment 500 and any edge devices such as a keyboard, mouse, monitor, printer, CD-ROM drive, network controller, and so forth.
- the computing environment 500 may be implemented as a desktop computer, a laptop computer, a mobile telephone, a smartphone, a PDA, and many other consumer electronic devices.
- FIG. 6 is a flow chart of an exemplary method 600 for processing I/O requests by a computing device comprising multiple CPUs, each embedding a protocol stack-based operating system.
- the method may commence in operation 610 , when a CPU embedding a MCP (i.e., a Master Core) receives an I/O request.
- In optional operation 620, the network protocol may be determined. According to various embodiments, the protocol is TCP/IP, UDP/IP, a combination thereof, or the like.
- the Master Core may generate multiple strands (e.g., within processes) according to the I/O requests and the network protocol (optionally determined in operation 620).
- the Master Core may schedule and allocate the multiple strands among one or more CPUs 412 , 414 , 416 , 418 (see FIG. 4 ) and other CPUs of the computing device. The allocation of multiple strands may include communicating data via a network interface (e.g., via a bus using I/O interfaces of the CPUs).
- the strands may be processed in the one or more CPUs.
- the processing at each CPU is performed by the state machine that is the operating system, e.g., a TCP/IP stack state machine operating system.
- the operating system may utilize Sockets style API of sockets and ports on IP addresses for handling these strands.
- processing results (e.g., arithmetical or logic results) from multiple CPUs may be assembled by the Master Core for further outputting. According to another example, assembling may be performed within a different CPU, or, alternatively, processing results may be directly transmitted to a corresponding edge device.
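The flow of method 600 (receive a request, optionally determine the protocol, generate strands, allocate them, process them, assemble the results) might be sketched as follows. This is a loose software analogy under stated assumptions: worker threads stand in for hardware CPUs, the request is a plain dictionary, and `generate_strands`, `process_strand`, and `handle_io_request` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_strands(io_request, strand_count):
    """Split one request payload into roughly equal strands (hypothetical policy)."""
    data = io_request["payload"]
    size = max(1, -(-len(data) // strand_count))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_strand(chunk):
    """Stand-in for the work one CPU performs on its strand: sum the numbers."""
    return sum(chunk)


def handle_io_request(io_request, cpu_count=4):
    protocol = io_request.get("protocol", "TCP/IP")    # optional protocol determination
    strands = generate_strands(io_request, cpu_count)  # Master Core generates strands
    with ThreadPoolExecutor(max_workers=cpu_count) as cpus:  # threads model the CPUs
        partials = list(cpus.map(process_strand, strands))   # allocate and process
    return {"protocol": protocol, "result": sum(partials)}   # assemble the results


print(handle_io_request({"payload": list(range(1, 101))}))
# {'protocol': 'TCP/IP', 'result': 5050}
```

Here assembly happens in the caller, mirroring the case where the Master Core assembles the results; the alternative of assembling in a different CPU would move the final `sum` into one of the workers.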
- protocol stack-based multiple processors which can be used in different computing devices according to various embodiments disclosed herein.
- a conventional operating system may manage internal tasks and external programs in a dictatorial manner, wherein the appearance of multitasking is achieved through rapid allocation of time slices among multiple strands and processes.
- Such a system may be flexible and of a general purpose.
- applications and unknown driver components have little or no control over their scheduling in such a system.
- In contrast to a conventional operating system, the operating system according to the various embodiments disclosed herein is essentially a state machine. This results in the whole environment being inherently cooperative and friendly to the operating system as a state machine model. All systems and application components are built together in an open and symbiotic relationship. Only components actually required in a target system are built into the environment.
- the kernel and other systems components include all the normal functions of file and memory management, timers, input and output, TCP/IP, and the like.
- the conventional operating system provides a highly sophisticated and flexible system, but with the downside of a tremendous number of activities (and hence clock cycles and, therefore, energy) going on all the time.
- an implementation according to various embodiments disclosed herein may include only the required components. As a result, execution times and code sizes may be optimized, reducing the number of clock cycles executed and hence the energy consumed.
- Such computing device may have a number of state machines handling the operations at a lower level and forwarding data packets up through the TCP/IP stack. When no tasks need to be performed, the state machines are idle. Therefore, the protocol stack-based CPUs according to various embodiments disclosed herein eliminate unnecessary internal clock cycles through the use of intelligent tasking, in contrast to conventional multi-tasking.
- the ultra-low power aspect of the computing device may provide greatly improved battery life for various devices.
- Boot up time for devices may be greatly reduced by executing instructions from the ROM, saving general state information in battery-backed SRAM, and saving crucial microprocessor register setting and other state information in special registers in custom application-specific integrated circuits (ASICs), for example.
- a full IP stack typically includes an application layer, transport layer, Internet layer, and link layer.
- the basic operating system for the computing device may not normally have all the components of a full IP stack.
- a basic kernel may have, for example, just HTTP on top of TCP on top of IP on top of Ethernet.
- the kernel may be built with SNMP on UDP on IP on Ethernet.
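The idea of building a kernel from only the layers a target system needs can be illustrated with a lookup table. This is a minimal sketch, not the disclosed build process: the `RUNS_ON` table and `build_kernel_stack` function are hypothetical, and each protocol is assumed to sit on exactly one lower layer.

```python
# Each protocol names the single layer it sits directly on top of (None = bottom).
RUNS_ON = {
    "HTTP": "TCP",
    "SNMP": "UDP",
    "TCP": "IP",
    "UDP": "IP",
    "IP": "Ethernet",
    "Ethernet": None,
}


def build_kernel_stack(top_protocol):
    """Return only the layers needed to support the requested top-level protocol."""
    stack = []
    layer = top_protocol
    while layer is not None:
        stack.append(layer)
        layer = RUNS_ON[layer]
    return stack


print(build_kernel_stack("HTTP"))  # ['HTTP', 'TCP', 'IP', 'Ethernet']
print(build_kernel_stack("SNMP"))  # ['SNMP', 'UDP', 'IP', 'Ethernet']
```

Neither resulting stack contains the other's transport layer, which is the sense in which unused components are left out of the build.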
- the computing device may also attempt to identify which sub-processes in a larger process need to be executed sequentially and which sub-processes may be executable in parallel.
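One conventional way to separate sequential from parallel sub-processes is to sort them by dependency and group those that become ready together. The sketch below uses Python's standard `graphlib` module; the sub-process names and the `parallel_stages` helper are hypothetical, and the source does not say which technique the device actually uses.

```python
from graphlib import TopologicalSorter


def parallel_stages(dependencies):
    """Group sub-processes into stages; members of one stage do not depend on each other."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose prerequisites are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Maps each sub-process to the set of sub-processes it must wait for.
deps = {"parse": {"read"}, "checksum": {"read"}, "store": {"parse", "checksum"}}
print(parallel_stages(deps))
# [['read'], ['checksum', 'parse'], ['store']]
```

Stages must run one after another, while the members of a single stage could each be handed to a separate CPU as a strand.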
- the computing device may provide a model of a set of simple state machines.
- a State Machine Manager (SMM) may be provided to regulate and control the run flow.
- applications register priority and execution parameter requests with the SMM, which in turn handles them accordingly in a fair manner.
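A minimal sketch of how applications might register with an SMM is shown below. It assumes one particular reading of "fair": higher-priority requests run first and requests of equal priority run in arrival order. The `StateMachineManager` class, its methods, and the `deadline_ms` parameter are hypothetical illustrations only.

```python
import heapq
import itertools


class StateMachineManager:
    """Toy SMM: higher priority runs first; equal priorities run in arrival order."""

    def __init__(self):
        self._queue = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def register(self, app_name, priority, **params):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._queue, (-priority, next(self._arrival), app_name, params))

    def run_all(self):
        """Drain the queue and return the order in which applications were served."""
        order = []
        while self._queue:
            _, _, app_name, _params = heapq.heappop(self._queue)
            order.append(app_name)
        return order


smm = StateMachineManager()
smm.register("logger", priority=1)
smm.register("network", priority=5, deadline_ms=10)
smm.register("display", priority=5)
print(smm.run_all())  # ['network', 'display', 'logger']
```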
- multicore processors are designed first, and thereafter an operating system is designed to run on such processors.
- the operating system design is limited by compromises dictated by the multicore processor design.
- the applications are then designed to run on the operating system.
- the design of the applications is limited by all the limitations dictated by the particular operating system design.
- an operating system may be designed first according to the embodiments described herein. Any unnecessary aspects may be removed for the design. A computing device having multiple CPUs may then be designed. The design process may be iterated to make still further reductions down to the essential components.
- the operating system code executes within a ROM. Combined with saving register contents during a deep sleep, executing from the ROM as a state machine provides an “instant-on” capability in which the system may take just milliseconds to resume execution.
- a RAM memory may be used for only truly read-write data that requires it, while the execute-only code may be stored in the ROM.
- the slower access times of ROM devices versus RAM devices may not cause an issue, because the instruction cycle times for the system are generally slow, albeit for a reduced number of cycles.
- Non-volatile media include, for example, optical or magnetic disks, such as a fixed disk.
- Volatile media include dynamic memory, such as system RAM.
- Transmission media include coaxial cables, copper wire, and fiber optics, among others, including the wires that comprise one embodiment of a bus.
- Computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, DVD, any other optical medium, any other physical medium with patterns of marks or holes, a RAM, a PROM, an EPROM, an EEPROM, a FLASHEPROM, any other memory chip or cartridge, or any other medium from which a computer can read.
- a bus may carry the data to system ROM (or RAM), from which a CPU retrieves and executes the instructions.
- the instructions received by system ROM (or RAM) may optionally be stored on a fixed disk either before or after execution by a CPU.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Debugging And Monitoring (AREA)
Abstract
A computing apparatus and corresponding method for operating are disclosed. The computing apparatus may comprise a set of interconnected central processing units (CPUs). Each CPU may embed an operating system including a kernel comprising a protocol stack. At least one of the CPUs may further embed executable instructions for allocating multiple strands among the rest of the CPUs. The protocol stack may comprise a Transmission Control Protocol/Internet Protocol (TCP/IP), a User Datagram Protocol/Internet Protocol (UDP/IP) stack, an Internet Control Message Protocol (ICMP) stack or any other suitable Internet protocol. The method for operating the computing apparatus may comprise receiving input/output (I/O) requests, generating multiple strands according to the I/O requests, and allocating the multiple strands to one or more CPUs.
Description
This application is a continuation of U.S. patent application Ser. No. 13/224,938, filed on Sep. 2, 2011, entitled “Massively Multicore Processor and Operating System to Manage Strands in Hardware,” which is incorporated by reference in its entirety. This application is also related to U.S. patent application Ser. No. 13/277,111, filed on Oct. 19, 2011, entitled, “TCP/IP Stack-Based Operating System,” which is a continuation of U.S. patent application Ser. No. 12/938,290, filed on Nov. 2, 2010, entitled, “TCP/IP Stack-Based Operating System,” both of which are incorporated by reference in their entirety.
TECHNICAL FIELD
The application generally relates to computing devices having multiple processors and, more specifically, to a multicore processor and operating system based on a protocol stack.
BACKGROUND
Computing devices such as desktop computers, laptop computers, cell phones, smartphones, personal digital assistants (PDA), and many other electronic devices are widely deployed. The primary element of such computing devices is a central processing unit (CPU), or a processor, which is responsible for executing instructions of one or more computer programs. The CPU executes each program instruction in sequence to perform the basic arithmetical, logical, and input/output operations of the computing device. Design and implementation of such devices in general, and CPUs in particular, may vary; however, their fundamental functionalities remain very similar.
Traditionally, in a computing device, the CPU is coupled to a memory and an Input/Output (I/O) subsystem, directly or through a bus, to perform the main functions of computing devices such as inputting and outputting data, processing data, and so forth. The memory may embed an operating system (OS), computer programs, applications, and so forth.
Conventional operating systems are quite similar in architecture, in that each tends to have conventional file and memory operations, storage and graphical user interface operations, and so forth. Architectures of conventional operating systems include a layered design, device drivers, and Application Programming Interfaces (APIs).
In conventional operating systems, a core kernel essentially has master control over all the operations of the overlying software, components, device drivers, applications, and so forth. Traditionally, operating systems implement ‘multi-tasking’ through time slicing and sequential allocation of computer resources to various threads and processes. A thread generally runs within a process and shares resources, e.g., memory, with other threads within the same process, whereas a process generally runs ‘self-contained’ within its own right and completely independently of any other process. In multi-tasking, when a computing device includes a single processor, the operating system instructs the processor to switch between different threads and implement them sequentially. Switching generally happens frequently enough that the user may perceive the threads (or tasks) as running simultaneously.
Many conventional computing devices utilize multiprocessors, or multicore processors, which may truly allocate multiple threads or tasks to run at the same time on different cores. However, conventional multicore processor architectures involve a small number of cores (typically 2, 4, 6, or 8 cores) due to the design limitations of traditional hardware and traditional operating systems. In the case of a conventional multicore processor, the computing device still must implement time slicing and switching between different threads on each of its cores when performing several tasks involving multithreading allocated through the cores. In other words, even conventional multicore processors cannot implement true multitasking.
Traditional processor architectures are also known to experience hanging, cycling, or crashing of the threads when applications are poorly written or purposely malicious. In many instances, a thread crash may bring the whole processor down and result in time-division multiplexing of various threads or processes.
Conventional processor designs use a fixed-frequency, continuously running crystal as the timing mechanism for clocking through microprocessor execution cycles. Thus, the crystal and the processor may continue running even if nothing is being accomplished in the computing device, uselessly cycling around and waiting for a process to actually perform an action. This timing paradigm results in wasted energy. First, the crystal and processor transistors typically execute at their maximum speed at all times, thereby consuming excess power and generating excess heat. Secondly, it is inefficient to continue running clock cycles if no substantive process is actually running. However, these inefficiencies are unavoidable in the conventional operating system design.
Furthermore, conventional operating systems require various modifications and enhancements each year, such as incorporation of new communications layers for Ethernet drivers, Transmission Control Protocol/Internet Protocol (TCP/IP) stacks, Web browsers, and the like. Generally, these new layers are added on top of the conventional operating system, thereby increasing complexity, decreasing performance, and often leading to software crashes and security flaws.
SUMMARY
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
In accordance with various embodiments disclosed herein, a computing device having multiple CPUs interconnected to each other is provided. Each CPU embeds an operating system of an entirely new architecture. This operating system may be based fundamentally around an Internet stack, for example, the TCP/IP stack (instead of including a TCP/IP layer as in a conventional core operating system) and may utilize a conventional interface or similar extensions of the standard Berkeley Sockets (or WinSock) APIs.
In accordance with various embodiments disclosed herein, a computing apparatus is provided. The computing apparatus may comprise a set of interconnected central processing units. Each CPU may embed an operating system (OS) comprising an operating system kernel, the operating system kernel being a state machine and comprising a protocol stack. At least one of the CPUs may further embed executable instructions for allocating multiple strands to one or more other CPUs of the set of interconnected CPUs. It will be understood that a strand, as used herein, is a hardware oriented process and is not necessarily similar to a conventional unit of processing (i.e., a thread) that can be scheduled by an operating system. The Internet stack is a set of communication protocols used for the Internet and other similar networks. In one example embodiment, the Internet stack may comprise a TCP/IP stack such that the OS kernel is a TCP/IP stack state machine with proprietary extensions that can be used to change or access internals of the TCP/IP stack state machine. In another example embodiment, the Internet stack may comprise a User Datagram Protocol/Internet Protocol (UDP/IP) stack such that the OS kernel is a UDP/IP stack state machine with proprietary extensions that can be used to change or access internals of the UDP/IP stack state machine. The CPU may comprise a processing unit, a memory and an I/O interface. Executable instructions for the operating system may be stored within one or more types of storage media, such as for example, Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Field Programmable Read-Only Memory (FPROM), One-Time Programmable Read-Only Memory (OTPROM), One-Time Programmable Non-Volatile Memory (OTP NVM), Erasable Programmable Read-Only Memory (EPROM), and Electrically Erasable Programmable Read-Only Memory (EEPROM or Flash ROM).
The computing apparatus may further comprise at least one asynchronous clock to serve as an internal clock for the operating system. The asynchronous clock may be configurable to automatically stop when clock cycles are no longer needed. A time reference for the operating system kernel may be based, for example, on a Network Time Protocol (NTP), Simple Network Time Protocol (SNTP), or other suitable time protocol from a remote time server. In an example, the operating system may utilize a Sockets style API of sockets and ports on IP addresses for handling I/O requests. The set of CPUs may be interconnected through a bus. Executable instructions for the operating system may be executed through a Sockets API. The at least one CPU that embeds executable instructions for allocating multiple strands may further comprise instructions for generating multiple strands.
According to another embodiment, a method for operating a computing apparatus is provided. The method may comprise receiving I/O requests, generating multiple strands according to the I/O requests, allocating the multiple strands to one or more CPUs of a set of CPUs, and processing the multiple strands. Each CPU may embed an operating system (OS) having a kernel comprising a protocol stack.
According to various embodiments, the I/O requests may be received by a CPU, which embeds executable instructions for allocating multiple strands through multiple CPUs. Allocating multiple strands may comprise communicating data via a network interface.
In one embodiment, the method may further comprise assembling results of multiple strands processing. Executable instructions for the operating system may be stored in a memory and executed through a Sockets API.
According to some embodiments, a non-transitory computer-readable storage medium is provided having embodied instructions thereon, instructions executable by a processor in a computing device to perform a method. The method may comprise receiving an input/output (I/O) request, generating one or more strands according to the I/O request, allocating the one or more strands and/or processes to one or more central processing units (CPUs) of a set of CPUs, wherein each CPU of the set embeds an operating system (OS) having a kernel comprising a protocol stack, and processing the one or more strands and/or processes.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements.
FIG. 1 is a block diagram of a CPU, according to various exemplary embodiments.
FIG. 2 illustrates an exemplary architecture of an Internet stack state machine-based system, according to various embodiments.
FIG. 3 is a flow chart illustrating a method for a CPU embedding a protocol stack-based operating system, according to an exemplary embodiment.
FIG. 4 is a block scheme of a computing device, according to various exemplary embodiments.
FIG. 5 is a computing environment, according to various exemplary embodiments.
FIG. 6 is a flow chart of a method for processing I/O requests by a computing device comprising multiple CPUs with embedded Internet stack-based operating systems, according to an exemplary embodiment.
Various aspects of the subject matter disclosed herein are now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more aspects. It may be evident, however, that such aspects may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing one or more aspects.
Various embodiments disclosed herein relate to computing devices comprising a set of interconnected CPUs. The number of the CPUs is not limited, and may be more than 100, or even more than 10,000, depending on specific application of the computing devices. The CPUs may be interconnected (e.g., through one or more buses) so that multiple strands, processes, and tasks can be allocated among a few or even all CPUs, thereby implementing parallelism or true multi-tasking. According to some embodiments, each of some or all of the CPUs is allocated a respective strand.
As used herein, the term “central processing unit” relates to a processor, a microprocessor, a controller, a microcontroller, a chip, or other processing device that carries out arithmetic and logic instructions of an operating system, a computer program, an application, or the like. According to various embodiments disclosed herein, the CPU comprises a processing unit (typically including an arithmetic logic unit and a control unit) and a memory (also known as “registers,” or Read Only Memory (ROM)). In some embodiments, the CPU may further comprise an I/O Subsystem (Interface) to allow data transfer between the CPU and any other devices such as another CPU or I/O devices such as a keyboard, mouse, printer, monitor, network controller, and so forth.
The CPU memory may store an operating system based entirely on a protocol stack. A protocol stack, as used herein, is a particular software implementation of a computer networking protocol suite. The protocol stack may be a TCP/IP stack, UDP/IP stack, Internet Control Message Protocol (ICMP) stack, combinations thereof, or other protocols. The operating system embedded in the CPU is fundamentally a state machine. The kernel of the operating system is fundamentally a protocol stack.
Such an operating system is inherently Internet-oriented and all Internet type functionality is natural and inherent in its protocol stack-based processor design and implementation. In addition, such an operating system may operate within small hardware, be run by very compact and efficient software, possess minimal clock cycles for execution, have a natural Internet connectivity model and ultra low power consumption.
FIG. 1 illustrates a block diagram of an exemplary CPU 100. The CPU 100 may be a processor, a microprocessor, a chip, or the like. The CPU 100 may include a memory 110, which may embed an operating system and, optionally, further software applications. The operating system may comprise a kernel to provide communications between software and hardware components/modules. The kernel may be a state machine with extensions and may comprise an Internet stack. The Internet stack may include a set of communication protocols used for the Internet and similar networks. For example, the Internet stack may include a TCP/IP stack so that the OS kernel is a TCP/IP stack state machine. According to another example, the Internet stack includes a UDP/IP stack such that the OS kernel is a UDP/IP stack state machine. According to yet another example, the Internet stack includes an ICMP stack such that the OS kernel is an ICMP stack state machine.
The memory 110 may store one or more modules. Exemplary modules, which may be stored in the memory 110, include an I/O request receiver module 120, a protocol handling module 130, an I/O request processing module 140, and an optional network interface module 150. It will be appreciated by one skilled in the art that the technology described herein encompasses those embodiments where one or more of the modules may be combined with each other or not included in the memory 110 at all.
The CPU 100 may further include a processing unit 160 for executing various instructions and running modules stored in the memory 110. The processing unit 160 may comprise an arithmetic logic unit to carry out mathematical functions, and a control unit to regulate data flow through the processing unit 160 and the CPU 100. Those skilled in the art would understand that any suitable architecture of the processing unit 160 is applicable.
A module should be generally understood as one or more applications (routines) that perform various system-level functions and may be dynamically loaded and unloaded by hardware and device drivers as required. The modular software components described herein may also be integrated as part of an application specific component.
According to various embodiments, the modules may each include executable instructions for the operating system embedded into the CPU 100 and may be executed through a Sockets API.
The I/O request receiver module 120 may be configured to receive I/O requests. The requests may be from an application residing in an application layer of a computing device (as described in further detail with respect to FIG. 2).
The protocol handling module 130 may be configured to handle a specific protocol for the protocol stack state machine implementation. For example, the protocol may be a TCP/IP stack such that the operating system is a TCP/IP stack state machine. In some embodiments, the protocol stack may include a different protocol stack (e.g., a UDP/IP stack or ICMP stack which may be used in addition to or in place of the TCP/IP stack).
The operating system may utilize a Sockets style API of sockets and ports on IP addresses for handling I/O requests. The I/O request processing module 140 may be configured to process the I/O requests from an application according to the network protocol using the operating system.
The optional network interface module 150 may be included and is configured to provide an interface between the protocol stack state machine and a network interface. The corresponding network interface may be a hardware unit or a “soft” Ethernet controller.
The CPU 100 may also comprise a clock. The CPU 100 may require a clock to drive the state transitions as the CPU 100, for instance, reads and decodes opcodes. Conventionally this is done by some external oscillator circuitry, typically driven by a fixed-frequency crystal. However, clocking may also be done by more than one crystal, e.g., a high-frequency crystal (e.g., 50 MHz) for the main CPU core and other (lower-frequency) crystals for other uses, e.g., programmable timers, watchdog timers, etc. Also, a system comprising, for instance, a Universal Asynchronous Receiver/Transmitter (UART) and a Network Interface Controller (NIC) typically requires clock inputs of some sort. For instance, a UART may need a reliable clock source all the way from perhaps 300 baud up to 921,600 baud. A NIC running 100 MBit Ethernet would typically need a clock source of 50 MHz or 25 MHz.
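To see why a UART needs a suitable clock source across the 300 to 921,600 baud range, it helps to compute the clock divisor for each rate. The sketch below assumes the common 16x oversampling scheme and a 14.7456 MHz reference crystal; both are illustrative assumptions that do not come from the source, and `uart_divisor` is a hypothetical helper.

```python
def uart_divisor(clock_hz, baud, oversample=16):
    """Return the integer clock divisor and the resulting baud-rate error in percent."""
    divisor = max(1, round(clock_hz / (oversample * baud)))
    actual = clock_hz / (oversample * divisor)
    error_pct = 100.0 * (actual - baud) / baud
    return divisor, error_pct


# A 14.7456 MHz crystal divides evenly into all three rates (assumed example values).
for baud in (300, 9600, 921_600):
    print(baud, uart_divisor(14_745_600, baud))

# A 50 MHz core clock does not divide cleanly into 921,600 baud.
print(uart_divisor(50_000_000, 921_600))
```

Under these assumptions the 50 MHz case yields an error well above what a UART link usually tolerates, which illustrates why a separate, lower-frequency crystal may be used for such peripherals.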
Typically, a computer system needs to keep track of time, and can do so using internal counters to keep track of its internal clocks. However, in the case of an Internet-connected device, such as in various embodiments described herein, the device is connected to the Internet and thus has readily available external time sources, for instance from Network Time Protocol (NTP), Simple Network Time Protocol (SNTP), or other suitable time protocols from a remote server (i.e., time protocol servers). For the CPU 100, the processing unit 160, where included, may utilize a time reference obtained using NTP, SNTP, or another suitable time protocol from a remote time server. Alternatively, the Precision Time Protocol (PTP) can be used for synchronization within a Local Area Network (LAN).
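As a rough illustration of taking a time reference from an SNTP server, the sketch below builds a client request and decodes the transmit timestamp of a reply. It works on raw bytes only and sends nothing over the network; `build_sntp_request` and `parse_transmit_time` are hypothetical helper names, and the reply used in the example is fabricated locally purely to exercise the parser.

```python
import struct

NTP_UNIX_OFFSET = 2_208_988_800  # seconds between 1900-01-01 (NTP) and 1970-01-01 (Unix)


def build_sntp_request():
    """48-byte client request: LI=0, version=3, mode=3 (client) packed in the first byte."""
    return bytes([0x1B]) + bytes(47)


def parse_transmit_time(packet):
    """Return the transmit timestamp (bytes 40 to 47 of the packet) as Unix seconds."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_OFFSET + fraction / 2**32


# Locally constructed stand-in for a server reply (not real server data).
fake_reply = bytes(40) + struct.pack("!II", NTP_UNIX_OFFSET + 1_000_000_000, 2**31)
print(len(build_sntp_request()), parse_transmit_time(fake_reply))
# 48 1000000000.5
```

A real client would send the request to UDP port 123 of a time server and would also account for network delay, which this sketch leaves out.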
According to some example embodiments, an asynchronous (variable) clock may serve as an internal clock for the operating system for the CPU 100. The asynchronous clock may be configurable to automatically stop when clock cycles are no longer needed. The asynchronous system clock may be restarted by a wake-up “daemon” signal from the SNMP daemon (for example, an incoming data packet).
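The stop-when-idle, restart-on-wake-up behavior can be modeled in a few lines. This is a software toy under stated assumptions, not a description of the clock circuitry: the `GatedClock` class is hypothetical and each packet is assumed to cost a fixed number of cycles.

```python
from collections import deque


class GatedClock:
    """Toy asynchronous clock: ticks only while work is pending, otherwise stopped."""

    def __init__(self):
        self.running = False
        self.cycles = 0
        self.pending = deque()

    def wake(self, packet):
        """An incoming packet acts as the wake-up signal and restarts the clock."""
        self.pending.append(packet)
        self.running = True

    def run(self, cycles_per_packet=3):
        """Consume pending packets, then stop the clock once the queue drains."""
        while self.pending:
            self.pending.popleft()
            self.cycles += cycles_per_packet
        self.running = False


clock = GatedClock()
clock.wake(b"snmp-get")
clock.run()
print(clock.cycles, clock.running)  # 3 False
```

No cycles accumulate between `run` and the next `wake`, in contrast to a fixed-frequency crystal that would keep counting while idle.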
Furthermore, a combination of the above-mentioned clocking approaches can be used. For example, in the initial phases, the internal clock may be used to trigger the CPU 100. The internal clock may be utilized until the CPU 100 is fully active, at which time most or all of the clock requirements may be transitioned to external time protocols, e.g., using Internet time servers using NTP, SNTP, or other suitable time protocols from a remote time server, or using PTP and SNMP to take over the control of the clocking operations. This would mean that internal clock circuitry for the CPU 100 could be turned off, thus conserving power.
Executable instructions for the CPU 100 may be optimized to be more efficient than conventional CPUs so that much lower clock rates are used. A self-adjusting cycle rate may be provided depending on the load and function to be performed. In addition, self-learning or predetermined algorithms for expected scenarios may be utilized to put the CPU 100 into a ‘sleep’ or ‘doze’ mode. An expected external event may cause the CPU 100 to exit the doze mode, resume full speed operation to execute necessary operations and handle the external event, and return back to doze. In a doze or a deep sleep mode, the CPU register contents may be read and stored in special registers with long deep-sleep data maintaining capabilities. Such clock saving measures may yield substantial power savings.
FIG. 2 illustrates an exemplary architecture 200 for a TCP/IP stack state machine-based system, according to various embodiments. The operating system kernel may include various components operating between applications 210 and hardware 220. The kernel may include a TCP stack 232, UDP stack 236, and/or ICMP stack 240, around which the operating environment may be built. The kernel may include TCP extensions 230, UDP extensions 234, and ICMP extensions 238, which together with the respective TCP stack 232, the UDP stack 236, and the ICMP stack 240 are shown above an IP layer 250. The kernel may include one or more device drivers 260, 262, and 264, as well as an Ethernet controller 270.
The API for all operations of the operating system may include the conventional Berkeley Sockets style API of sockets and ports on IP addresses. The Berkeley Sockets may specify the data structures and function calls that interact with the network subsystem of the operating system. The kernel may handle the normal Sockets APIs. The Sockets API 280 may also include some optimized APIs.
Any non-conventional functions (i.e., outside the normal functions used to communicate over the Internet) may be handled in a similar manner (e.g., by opening sockets and binding to ports). Thus, accessing of local input and output (e.g., keyboards, mice, and display screens) may be accomplished through socket/port operations. Consequently, it is quite transparent as to whether a device is local or remote. A keyboard could be at a local host at, for example, 127.0.0.1, or remote at another IP address. Though this transparency may be an aspect of other operating systems, it may not be inherent in the operating system design from the outset. Accordingly, the size of a basic kernel may be very small in a minimal configuration, perhaps as small as a few hundred bytes. It will be understood that the Windows Sockets technology above is mentioned merely for the purpose of providing an example. In contrast to the present technology, in the Windows Sockets technology communications with a display device over the Internet may be cumbersome.
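The local-versus-remote transparency described above can be imitated on an ordinary operating system with two loopback sockets. This is only an analogy using the standard Berkeley-style `socket` module: the "keyboard" is a hypothetical stand-in that sends one datagram, the message format `key:A` is invented for the example, and the port numbers are chosen by the host OS.

```python
import socket

# A "keyboard" exposed as a UDP endpoint on the loopback address.
keyboard = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
keyboard.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
keyboard_addr = keyboard.getsockname()

# The application reads the device exactly as it would read a remote peer.
app = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
app.bind(("127.0.0.1", 0))
app.settimeout(2.0)  # avoid blocking forever if the datagram is lost

keyboard.sendto(b"key:A", app.getsockname())  # the device reports a key press
data, sender = app.recvfrom(64)
print(data, sender == keyboard_addr)

keyboard.close()
app.close()
```

Replacing `127.0.0.1` with another host's address would leave the application code unchanged, which is the point of treating devices as socket endpoints.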
FIG. 3 is a flow chart illustrating an exemplary method 300 for a CPU embedding a protocol stack-based operating system. The method 300 may commence at operation 310 with receiving an I/O request. The request may be from an application residing in an applications layer 210 of a computing device. In operation 320, the network protocol may be determined. According to some embodiments, the protocol is TCP/IP, so that the operating system is a TCP/IP stack state machine. In some other embodiments, the protocol is UDP/IP. UDP is an unreliable connectionless protocol sitting on top of IP, and TCP is a connection-oriented reliable protocol. The protocol may be a hybrid of TCP and UDP, wherein a data connection stream includes a mixture of UDP and TCP packets. UDP has less overhead and is suitable for lower-importance information, whereas TCP has a higher overhead but essentially guarantees delivery. For instance, a stream of data comprising non-essential information (such as low-importance data) mixed with critical data could better be transmitted over such a hybrid link. This hybrid protocol may be determined in operation 320.
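As a rough illustration of the hybrid link, the following Python sketch labels each message in a mixed stream with the transport it would use. The `Message` type, its `critical` flag, and both function names are assumptions made for this example; the disclosure does not specify how importance is signaled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Message:
    payload: bytes
    critical: bool  # assumption: the sender marks which data must arrive


def choose_transport(msg: Message) -> str:
    """Pick TCP for data that must be delivered, UDP otherwise."""
    return "TCP" if msg.critical else "UDP"


def plan_stream(messages: List[Message]) -> List[Tuple[str, bytes]]:
    """Label each message in a mixed stream with its transport."""
    return [(choose_transport(m), m.payload) for m in messages]


if __name__ == "__main__":
    stream = [
        Message(b"telemetry sample", critical=False),
        Message(b"shutdown command", critical=True),
    ]
    for transport, payload in plan_stream(stream):
        print(transport, payload)
```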
In operation 330, the I/O request may be processed according to the network protocol. The processing may be performed by the state machine that is the operating system (e.g., a TCP/IP stack state machine operating system). The operating system may utilize a Sockets style API of sockets and ports on IP addresses for handling I/O requests. The conventional Berkeley Sockets style API of sockets and ports on IP addresses may be used. The Berkeley Sockets may specify the data structures and function calls that interact with the network subsystem of the operating system.
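To make the notion of a protocol stack state machine more concrete, here is a minimal table-driven sketch in Python. It covers only the passive-open path of the TCP connection states defined in RFC 793; the event labels and the `ConnectionStateMachine` class are this example's own, and silently ignoring unexpected events is a simplification (a real TCP implementation would, for instance, send resets).

```python
# Subset of the TCP connection states and transitions from RFC 793
# (passive-open path only); event names are this sketch's own labels.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


class ConnectionStateMachine:
    def __init__(self) -> None:
        self.state = "CLOSED"

    def handle(self, event: str) -> str:
        """Advance on a known (state, event) pair; otherwise stay put."""
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


if __name__ == "__main__":
    sm = ConnectionStateMachine()
    for event in ("passive_open", "recv_syn", "recv_ack"):
        print(event, "->", sm.handle(event))
```

Because the machine only does work when an event arrives, it is idle between events, which matches the low-activity behavior the description attributes to a state machine-based operating system.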
FIG. 4 is a block diagram of a computing device 400, according to an exemplary embodiment. The computing device 400 may comprise five CPUs 410, 412, 414, 416, and 418. Although five CPUs are shown, it will be appreciated by one skilled in the art that any number of CPUs may be used in the computing device 400. Some embodiments may include up to 10,000 CPUs or even more.
The CPUs 410, 412, 414, 416, and 418 may all be coupled to a bus line 420 so that they may communicate data amongst each other. According to various embodiments disclosed herein, each CPU embeds an operating system based on a protocol stack. The protocol stack may be a TCP/IP protocol stack, UDP/IP stack, combinations thereof (i.e., hybrid stack), or other appropriate protocols. One particular example of the CPU embedding a TCP/IP stack-based operating system is described with reference to FIG. 1.
Although not shown in FIG. 4, the CPUs 410, 412, 414, 416, and 418 may each include a memory storing an operating system and/or any further executable instructions and/or data. The memory can be implemented within the CPU or externally. In one example, all CPUs 410, 412, 414, 416, and 418 may share a single memory coupled to the bus 420. As used herein, the term “memory” refers to any type of long term, short term, volatile, nonvolatile, or other storage devices and is not limited to any particular type of memory or number of memories, or type of media upon which memory is stored.
The CPUs 410, 412, 414, 416, and 418 in the example in FIG. 4 may further comprise an I/O Interface (not shown) implemented as software and/or hardware. One particular example of a software implementation of the I/O Interface is shown as optional Network Interface Module 150 in FIG. 1. Alternatively, a hardware implementation may comprise an I/O controller, a Network Interface Controller (NIC) such as an Ethernet controller, or the like. It will be apparent to those skilled in the art that the I/O interface may support any communication standards and provide communications over a serial connection, parallel connection, FireWire connection, Ethernet connection, and so forth.
Each of the CPUs may further comprise a clock (not shown), which can be implemented within each CPU or externally. According to various embodiments, a single clock may be shared by all CPUs.
One or more of the CPUs may embed a Master Control Program (MCP) 430. According to the example in FIG. 4, the CPU 410 embeds the MCP 430. The MCP 430 is an application or a routine for managing operations of the remaining CPUs 412, 414, 416, and 418 and, therefore, the CPU 410 may be considered a “Master Core.” More specifically, the MCP 430 may be configured to receive I/O requests from outside devices, generate multiple strands (processes, tasks) according to the I/O requests, and allocate these strands (processes) to the other CPUs 412, 414, 416, and 418 so that the overall computational load is selectively distributed among the CPUs 412, 414, 416, and 418. However, in some embodiments, strands may be allocated to some of the CPUs 412, 414, 416, and 418, or to just one CPU. According to some embodiments, each of a number of CPUs (i.e., one, some, or all of the CPUs) is allocated a respective strand. After execution of all strands and/or processes allocated to different CPUs, the results of the computations may be assembled in the Master Core for further outputting. Alternatively, the CPUs 412, 414, 416, and 418 may deliver results directly to corresponding external devices. According to some embodiments, the computing device 400 may comprise several Master Cores for processing different types of I/O requests. In yet another embodiment, one Master Core may process all incoming I/O requests, while other Master Cores may be utilized for assembling the output of multiple CPUs and transmitting the assembled output results to corresponding outside devices. Those who are skilled in the art would readily understand that any number of Master Cores is possible, and each Master Core may implement the same or different functions.
According to various exemplary embodiments, whenever a strand or process is “created” (e.g., by a typical C-style ‘CreateThread( . . . )’ function call), the MCP physically allocates a hardware core stack to the strand (or process). An allocated core stack/strand combination may also be referred to as a “core strand”. The cores (or core strands) may form a massive array in which core strands may be wired as a block to share resources (e.g., memory), or allowed to share the resources over their interconnects. Cores in the (massive) array of cores may be connected to each other, e.g., interconnected by a web-like structure. Cores may be allocated processes in some embodiments, i.e., cores which are processes or “process cores”. Such exemplary process cores are naturally isolated from other process cores since processes run independently of other processes, each process containing its own resources, in contrast to strands where resources may be shared therebetween.
The computing device 400 allows only a certain number of CPUs to operate while the remaining CPUs, not involved in the processing, are turned off. For example, the computing device 400 may comprise 1,000 CPUs and a single Master Core. In response to the I/O request, the Master Core may generate 600 strands (variously within a number of processes) and allocate them to 600 CPUs. The remaining 400 CPUs may be turned off to conserve power. If another 100 strands later become needed, 100 of the 400 CPUs may be turned on in response to the allocating of the 100 strands to them so that the total number of the CPUs executing instructions becomes 700. As this example shows, the overall power consumption is reduced compared to a traditional system where all processors run all the time, even if there is no process or strand to execute.
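The arithmetic of this example can be reproduced with a small bookkeeping sketch in Python. The `CorePool` class and its methods are hypothetical and model only which worker cores are powered; the Master Core is not counted among the 1,000 cores here, and no real power control is performed.

```python
class CorePool:
    """Tracks which cores are powered; cores are switched on only when
    a strand is allocated to them (hypothetical bookkeeping only)."""

    def __init__(self, total_cores: int) -> None:
        self.total = total_cores
        self.assignments = {}  # core index -> strand id

    def allocate(self, strand_ids) -> None:
        """Assign each strand to an idle core, powering that core on."""
        strand_ids = list(strand_ids)
        free = [c for c in range(self.total) if c not in self.assignments]
        if len(strand_ids) > len(free):
            raise RuntimeError("not enough idle cores")
        for core, strand in zip(free, strand_ids):
            self.assignments[core] = strand

    def release(self, core: int) -> None:
        """Strand finished: the core may be powered off again."""
        self.assignments.pop(core, None)

    @property
    def powered_on(self) -> int:
        return len(self.assignments)

    @property
    def powered_off(self) -> int:
        return self.total - len(self.assignments)


if __name__ == "__main__":
    pool = CorePool(1000)
    pool.allocate(range(600))        # 600 strands from the first request
    print(pool.powered_on, pool.powered_off)
    pool.allocate(range(600, 700))   # 100 further strands
    print(pool.powered_on, pool.powered_off)
```

After the first allocation 600 cores are on and 400 are off; after the second, 700 are on and 300 remain off, matching the figures in the paragraph above.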
The computing device 400 may facilitate greater stability of operations when compared to conventional multicore processors. When one of the strands crashes, for example, due to a poorly written routine or for some other reason, only the CPU running the strand is affected, while other CPUs remain unaffected. This is in contrast to conventional systems where the entire multicore processor may become affected by a single strand crash.
FIG. 5 illustrates an exemplary embodiment of a computing environment 500. The computing environment 500 may comprise a computing device 510 (which is described in greater detail with reference to FIG. 4), a memory 520, a clock 530, and communication ports 540, all of which may be coupled to a bus 550.
The memory 520 may include any memory configured to store and retrieve data. Some examples of the memory 520 include storage devices, such as a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, digital video disk (DVD), any other optical medium, any other physical medium with patterns of marks or holes, a RAM, a ROM, a PROM, an EPROM, an EEPROM, a FLASHEPROM, OTPROM, OTP NVM, Flash ROM or any other memory chip or cartridge, or any other medium from which a computer can read instructions. The memory 520 may comprise a data structure configured to hold and organize data. The memory 520 may comprise executable instructions of the operating system and/or other routines and applications. The memory 520 may also comprise an MCP, as described above with reference to FIG. 4.
The clock 530 may serve as an asynchronous clock for the operating system for one or more CPUs of the computing device 510. The asynchronous clock may be configured to automatically stop when clock cycles are not needed.
The communication ports 540 represent a connection interface that allows asynchronous transmission of data between the computing environment 500 and any edge devices such as a keyboard, mouse, monitor, printer, CD-ROM drive, network controller, and so forth.
The computing environment 500 may be implemented as a desktop computer, a laptop computer, a mobile telephone, a smartphone, a PDA, and many other consumer electronic devices.
FIG. 6 is a flow chart of an exemplary method 600 for processing I/O requests by a computing device comprising multiple CPUs, with the CPUs each embedding a protocol stack-based operating system.
The method may commence in operation 610, when a CPU embedding an MCP (i.e., a Master Core) receives an I/O request. In optional operation 620, the network protocol may be determined. According to various embodiments, the protocol is TCP/IP, UDP/IP, a combination thereof, or the like. In operation 630, the Master Core may generate multiple strands (e.g., within processes) according to the I/O requests and the network protocol determined in optional operation 620. In operation 640, the Master Core may schedule and allocate the multiple strands among one or more CPUs 412, 414, 416, 418 (see FIG. 4) and other CPUs of the computing device. The allocation of multiple strands may include communicating data via a network interface (e.g., via a bus using I/O interfaces of the CPUs).
In operation 650, the strands (or alternatively the processes which contain strands) may be processed in the one or more CPUs. According to various embodiments, the processing at each CPU is performed by the state machine that is the operating system, e.g., a TCP/IP stack state machine operating system. The operating system may utilize a Sockets style API of sockets and ports on IP addresses for handling these strands.
In optional operation 660, processing results (e.g., arithmetical or logic results) from multiple CPUs may be assembled by the Master Core for further outputting. According to another example, assembling may be performed within a different CPU, or, alternatively, processing results may be directly transmitted to a corresponding edge device.
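The generate, allocate, process, and assemble steps of this method can be sketched in Python as a scatter/gather routine. This is an illustration under stated assumptions, not the disclosed implementation: worker threads from `concurrent.futures.ThreadPoolExecutor` stand in for the other CPUs, the "request" is simply a list of integers whose squares are summed, and the function names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def generate_strands(request: List[int], n_strands: int) -> List[List[int]]:
    """Operation 630 (sketch): split one request into independent strands."""
    return [request[i::n_strands] for i in range(n_strands)]


def process_strand(strand: List[int]) -> int:
    """Operation 650 (sketch): the work done on one CPU."""
    return sum(x * x for x in strand)


def master_core(request: List[int], n_cpus: int = 4) -> int:
    strands = generate_strands(request, n_cpus)
    # Operation 640 (sketch): worker threads stand in for the other CPUs.
    with ThreadPoolExecutor(max_workers=n_cpus) as cpus:
        partial_results = list(cpus.map(process_strand, strands))
    # Operation 660 (sketch): assemble the partial results in the Master Core.
    return sum(partial_results)


if __name__ == "__main__":
    print(master_core(list(range(10))))
```

The assembled result equals what a single CPU would compute for the whole request, so the split across strands changes where the work runs, not its outcome.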
The following provides an overview of the functionalities facilitated by protocol stack-based multiple processors, which can be used in different computing devices according to various embodiments disclosed herein.
A conventional operating system may manage internal tasks and external programs in a dictatorial manner, wherein the appearance of multitasking is achieved through rapid allocation of time slices among multiple strands and processes. Such a system may be flexible and of a general purpose. However, applications and unknown driver components have little or no control over their scheduling in such a system.
In contrast to a conventional operating system, the operating system according to the various embodiments disclosed herein is essentially a state machine. This results in the whole environment being inherently cooperative and friendly to the operating system as a state machine model. All systems and application components are built together in an open and symbiotic relationship. Only components actually required in a target system are built into the environment.
In a conventional operating system, the kernel and other systems components include all the normal functions of file and memory management, timers, input and output, TCP/IP, and the like. There are numerous strands and processes going on, such as kernel executive cycles around all the running processes, updating clocks, checking communication ports, updating displays, checking on Ethernet traffic, and so forth. As such, the conventional operating system provides a highly sophisticated and flexible system, but with the downside of a tremendous number of activities (and hence clock cycles and, therefore, energy) going on all the time.
In contrast, an implementation according to various embodiments disclosed herein may include only the required components. As a result, execution times and code sizes may be optimized, resulting in fewer clock cycles and lower energy consumption. Such a computing device may have a number of state machines handling the operations at a lower level and forwarding data packets up through the TCP/IP stack. When no tasks need to be performed, the state machines are idle. Therefore, the protocol stack-based CPUs according to various embodiments disclosed herein eliminate unnecessary internal clock cycles through the use of intelligent tasking, in contrast to conventional multi-tasking.
The ultra-low power aspect of the computing device according to the embodiments disclosed herein may provide greatly improved battery life for various devices. Boot up time for devices may be greatly reduced by executing instructions from the ROM, saving general state information in battery-backed SRAM, and saving crucial microprocessor register settings and other state information in special registers in custom application-specific integrated circuits (ASICs), for example.
A full IP stack typically includes an application layer, transport layer, Internet layer, and link layer. The basic operating system for the computing device may not normally have all the components of a full IP stack. A basic kernel may have, for example, just HTTP on top of TCP on top of IP on top of Ethernet. Alternatively, the kernel may be built with SNMP on UDP on IP on Ethernet. Those who are skilled in the art would readily understand that various implementations are possible.
The computing device may also attempt to identify which sub-processes in a larger process need to be executed sequentially and which sub-processes may be executable in parallel. The computing device may provide a model of a set of simple state machines. In complex systems, a State Machine Manager (SMM) may be provided to regulate and control the run flow. In operation, applications register priority and execution parameter requests with the SMM, which in turn handles them accordingly in a fair manner.
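One way to picture the State Machine Manager is the following Python sketch. The disclosure does not define the SMM's interface or its fairness policy, so everything here is an assumption: the `StateMachineManager` class, the convention that a lower number means higher priority, and the policy of stepping every registered machine once per cycle so that none is starved.

```python
import itertools
from typing import Callable, List, Tuple


class StateMachineManager:
    """Hypothetical SMM: lower number = higher priority; ties keep
    registration order, and every machine is stepped once per cycle."""

    def __init__(self) -> None:
        self._entries: List[Tuple[int, int, str, Callable[[], None]]] = []
        self._order = itertools.count()

    def register(self, name: str, priority: int, step: Callable[[], None]) -> None:
        """Record an application's priority request and its step function."""
        self._entries.append((priority, next(self._order), name, step))

    def run_cycle(self) -> List[str]:
        """Step every registered machine once, highest priority first."""
        ran = []
        for _prio, _seq, name, step in sorted(self._entries, key=lambda e: (e[0], e[1])):
            step()
            ran.append(name)
        return ran


if __name__ == "__main__":
    smm = StateMachineManager()
    smm.register("display", 2, lambda: None)
    smm.register("network", 1, lambda: None)
    smm.register("keyboard", 2, lambda: None)
    print(smm.run_cycle())
```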
Conventionally, multicore processors are designed first, and thereafter an operating system is designed to run on such processors. As a result, the operating system design is limited by compromises dictated by the multicore processor design. The applications are then designed to run on the operating system. The design of the applications is limited by all the limitations dictated by the particular operating system design.
In contrast to this conventional design process, an operating system may be designed first according to the embodiments described herein. Any unnecessary aspects may be removed from the design. A computing device having multiple CPUs may then be designed. The design process may be iterated to make still further reductions down to the essential components.
According to various embodiments, the operating system code executes within a ROM. Executing from the ROM as a state machine, with register contents saved during a deep sleep, provides an “instant-on” capability in which the system may take just milliseconds to resume execution. RAM may be used only for truly read-write data, while the execute-only code may be stored in the ROM. The slower access times of ROM devices versus RAM devices may not cause an issue, because the instruction cycle times for the system are generally slow, albeit for a reduced number of cycles.
The terms “computer-readable storage medium” and “computer-readable storage media” as used herein refer to any medium or media that participate in providing instructions to a CPU for execution. Such media can take many forms, including, but not limited to, non-volatile media, volatile media and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as a fixed disk. Volatile media include dynamic memory, such as system RAM. Transmission media include coaxial cables, copper wire, and fiber optics, among others, including the wires that comprise one embodiment of a bus. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, DVD, any other optical medium, any other physical medium with patterns of marks or holes, a RAM, a PROM, an EPROM, an EEPROM, a FLASHEPROM, any other memory chip or cartridge, or any other medium from which a computer can read.
Various forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to a CPU for execution. A bus may carry the data to system ROM (or RAM), from which a CPU retrieves and executes the instructions. The instructions received by system ROM (or RAM) may optionally be stored on a fixed disk either before or after execution by a CPU.
The above description is illustrative and not restrictive. Many variations of the embodiments will become apparent to those of skill in the art upon review of this disclosure. The scope of the subject matter should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.
While the present embodiments have been described in connection with a series of embodiments, these descriptions are not intended to limit the scope of the subject matter to the particular forms set forth herein. It will be further understood that the methods are not necessarily limited to the discrete steps or the order of the steps described. To the contrary, the present descriptions are intended to cover such alternatives, modifications, and equivalents as may be included within the spirit and scope of the subject matter as disclosed herein and defined by the appended claims and otherwise appreciated by one of ordinary skill in the art.
Claims (24)
1. A computing apparatus, comprising:
a set of interconnected central processing units (CPUs), each of the CPUs embedding an operating system (OS), the OS comprising an operating system kernel, the operating system kernel being a state machine based on a network protocol stack; and
at least one of the CPUs further embedding executable instructions for allocating multiple strands to one or more other CPUs of the set of interconnected CPUs.
2. The apparatus of claim 1, wherein the one or more other CPUs includes all other CPUs of the set such that the at least one of the CPUs embeds executable instructions for allocating multiple strands to all other CPUs of the set of interconnected CPUs.
3. The apparatus of claim 1, wherein the one or more other CPUs includes less than all of the other CPUs of the set, any of the CPUs not allocated strands being turned off to conserve power.
4. The apparatus of claim 1, wherein the one or more other CPUs includes less than all of the other CPUs of the set, and wherein any of the CPUs not allocated strands are placed in a sleep mode to conserve power.
5. The apparatus of claim 1, wherein the network protocol stack comprises a User Datagram Protocol/Internet Protocol (UDP/IP) stack such that the OS is a UDP/IP stack state machine or Internet Control Message Protocol (ICMP) stack such that the OS is ICMP stack.
6. The apparatus of claim 1, wherein each of the CPUs comprises a processing unit, a memory and an Input/Output (I/O) interface.
7. The apparatus of claim 6, wherein the memory includes one or more of the following memory types: a Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Field Programmable Read-Only Memory (FPROM), One-Time Programmable Read-Only Memory (OTPROM), One-Time Programmable Non-Volatile Memory (OTPNVM), Erasable Programmable Read-Only Memory (EPROM), and Electrically Erasable Programmable Read-Only Memory (EEPROM or Flash ROM), the executable instructions for the OS being stored within the one or more memory types wherein all operations for the OS are executed using a Sockets applications programming interface (API).
8. The apparatus of claim 1, further comprising at least one asynchronous clock to serve as an internal clock for the OS.
9. The apparatus of claim 8, wherein the asynchronous clock is configurable to automatically stop when clock cycles are no longer needed.
10. The apparatus of claim 1, wherein a time reference for the OS kernel is based on a Network Time Protocol (NTP), Simple Network Time Protocol (SNTP), or a Precision Time Protocol (PTP).
11. The apparatus of claim 1, wherein the set of interconnected CPUs are interconnected through a bus.
12. The apparatus of claim 1, wherein executable instructions for the operating system are executed through a Sockets applications programming interface (API).
13. The apparatus of claim 1, wherein the OS utilizes a Sockets style API of sockets and ports on Internet Protocol (IP) addresses for handling I/O requests.
14. The apparatus of claim 1, wherein the at least one CPU embedding executable instructions for allocating multiple strands further comprises instructions for generating multiple strands.
15. The apparatus of claim 1, where the set of interconnected CPUs comprises 1000 interconnected CPUs.
16. A method, comprising:
receiving an input/output (I/O) request;
generating one or more strands according to the I/O request;
allocating the one or more strands to one or more central processing units (CPUs) of a set of CPUs, wherein each CPU of the set embeds an operating system (OS) having a kernel based on a network protocol stack; and
processing the one or more strands.
17. The method of claim 16, wherein any of the CPUs not allocated at least one of the strands is turned off to conserve power.
18. The method of claim 16, wherein the network protocol stack comprises a Transmission Control Protocol/Internet Protocol (TCP/IP) stack such that the OS is a TCP/IP stack state machine.
19. The method of claim 16, wherein the network protocol stack comprises a User Datagram Protocol/Internet Protocol (UDP/IP) stack such that the OS is a UDP/IP stack state machine or Internet Control Message Protocol (ICMP) stack such that the OS is ICMP stack.
20. The method of claim 16, wherein at least one of the CPUs of the set of CPUs receives I/O requests, the at least one CPU embedding executable instructions for allocating the multiple strands to a number of the other CPUs of the set of CPUs.
21. The method of claim 16, wherein allocating comprises communicating data via a network interface.
22. The method of claim 16, further comprising assembling results of the processing.
23. The method of claim 16, wherein executable instructions for the operating system are stored in one or more of the following memory types: Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Field Programmable Read-Only Memory (FPROM), One-Time Programmable Read-Only Memory (OTPROM), One-Time Programmable Non-Volatile Memory (OTP NVM), Erasable Programmable Read-Only Memory (EPROM), and Electrically Erasable Programmable Read-Only Memory (EEPROM or Flash ROM), wherein all operations for the OS are executed using a Sockets applications programming interface (API).
24. A non-transitory computer-readable storage medium having embodied instructions thereon, instructions executable by a processor in a computing device to perform a method, the method comprising:
receiving an input/output (I/O) request;
generating one or more strands according to the I/O request;
allocating the one or more strands to one or more central processing units (CPUs) of a set of CPUs, wherein each CPU of the set embeds an operating system (OS), the OS comprising a kernel that is a state machine based on a network protocol stack; and
processing the one or more strands.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/333,802 US8607086B2 (en) | 2011-09-02 | 2011-12-21 | Massively multicore processor and operating system to manage strands in hardware |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/224,938 US8904216B2 (en) | 2011-09-02 | 2011-09-02 | Massively multicore processor and operating system to manage strands in hardware |
US13/333,802 US8607086B2 (en) | 2011-09-02 | 2011-12-21 | Massively multicore processor and operating system to manage strands in hardware |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/224,938 Continuation US8904216B2 (en) | 2011-09-02 | 2011-09-02 | Massively multicore processor and operating system to manage strands in hardware |
Publications (2)
Publication Number | Publication Date |
---|---|
US20130061078A1 (en) | 2013-03-07 |
US8607086B2 (en) | 2013-12-10 |
Family
ID=47754068
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/224,938 Expired - Fee Related US8904216B2 (en) | 2011-09-02 | 2011-09-02 | Massively multicore processor and operating system to manage strands in hardware |
US13/333,802 Expired - Fee Related US8607086B2 (en) | 2011-09-02 | 2011-12-21 | Massively multicore processor and operating system to manage strands in hardware |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/224,938 Expired - Fee Related US8904216B2 (en) | 2011-09-02 | 2011-09-02 | Massively multicore processor and operating system to manage strands in hardware |
Country Status (3)
Country | Link |
---|---|
US (2) | US8904216B2 (en) |
EP (1) | EP2751700A4 (en) |
WO (1) | WO2013032660A1 (en) |
Citations (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5469553A (en) | 1992-04-16 | 1995-11-21 | Quantum Corporation | Event driven power reducing software state machine |
US5493689A (en) | 1993-03-01 | 1996-02-20 | International Business Machines Corporation | System for configuring an event driven interface including control blocks defining good loop locations in a memory which represent detection of a characteristic pattern |
US5710910A (en) | 1994-09-30 | 1998-01-20 | University Of Washington | Asynchronous self-tuning clock domains and method for transferring data among domains |
US5896499A (en) | 1997-02-21 | 1999-04-20 | International Business Machines Corporation | Embedded security processor |
US5968133A (en) | 1997-01-10 | 1999-10-19 | Secure Computing Corporation | Enhanced security network time synchronization device and method |
US20020007420A1 (en) * | 1998-12-18 | 2002-01-17 | Microsoft Corporation | Adaptive flow control protocol |
US20020167965A1 (en) * | 2001-01-18 | 2002-11-14 | James Beasley | Link context mobility method and system for providing such mobility, such as a system employing short range frequency hopping spread spectrum wireless protocols |
US20030084190A1 (en) | 2001-10-25 | 2003-05-01 | Kimball Robert H. | Apparatus and system for maintaining accurate time in a wireless environment |
US20040049624A1 (en) | 2002-09-06 | 2004-03-11 | Oak Technology, Inc. | Network to computer internal interface |
US6714536B1 (en) | 1998-07-21 | 2004-03-30 | Eric M. Dowling | Method and apparatus for cosocket telephony |
US20040093520A1 (en) | 2000-07-03 | 2004-05-13 | Hak-Moo Lee | Firewall system combined with embedded hardware and general-purpose computer |
US20040143751A1 (en) | 2003-01-17 | 2004-07-22 | Cyrus Peikari | Protection of embedded processing systems with a configurable, integrated, embedded firewall |
US20040210320A1 (en) | 2002-06-11 | 2004-10-21 | Pandya Ashish A. | Runtime adaptable protocol processor |
CN1622517A (en) | 2003-11-27 | 2005-06-01 | 上海安创信息科技有限公司 | An embedded information security platform |
US20060026162A1 (en) | 2004-07-19 | 2006-02-02 | Zoran Corporation | Content management system |
US7002979B1 (en) | 2001-08-10 | 2006-02-21 | Utstarcom, Inc. | Voice data packet processing system |
US7036064B1 (en) | 2000-11-13 | 2006-04-25 | Omar Kebichi | Synchronization point across different memory BIST controllers |
US7055173B1 (en) | 1997-12-19 | 2006-05-30 | Avaya Technology Corp. | Firewall pooling in a network flowswitch |
US20060133370A1 (en) | 2004-12-22 | 2006-06-22 | Avigdor Eldar | Routing of messages |
US20070008976A1 (en) | 2001-04-11 | 2007-01-11 | Aol Llc | Local Protocol Server |
US20070022421A1 (en) | 2003-04-09 | 2007-01-25 | Eric Lescouet | Operating systems |
US20070118596A1 (en) * | 2000-12-05 | 2007-05-24 | Microsoft Corporation | System and method for implementing a client side http stack |
US7246272B2 (en) | 2004-01-16 | 2007-07-17 | International Business Machines Corporation | Duplicate network address detection |
US20070211633A1 (en) * | 2006-03-13 | 2007-09-13 | Microsoft Corporation | Competitive and considerate congestion control |
US20070255861A1 (en) | 2006-04-27 | 2007-11-01 | Kain Michael T | System and method for providing dynamic network firewall with default deny |
US7308686B1 (en) | 1999-12-22 | 2007-12-11 | Ubicom Inc. | Software input/output using hard real time threads |
US7334124B2 (en) | 2002-07-22 | 2008-02-19 | Vormetric, Inc. | Logical access block processing protocol for transparent secure file storage |
US20080046891A1 (en) | 2006-07-12 | 2008-02-21 | Jayesh Sanchorawala | Cooperative asymmetric multiprocessing for embedded systems |
US20080109665A1 (en) | 2003-02-14 | 2008-05-08 | International Business Machines Corporation | Network processor power management |
US7509673B2 (en) | 2003-06-06 | 2009-03-24 | Microsoft Corporation | Multi-layered firewall architecture |
US20090126003A1 (en) | 2007-05-30 | 2009-05-14 | Yoggie Security Systems, Inc. | System And Method For Providing Network And Computer Firewall Protection With Dynamic Address Isolation To A Device |
TW200924424A (en) | 2007-11-21 | 2009-06-01 | Inventec Corp | System for intrusion detection system |
US20090158299A1 (en) * | 2007-10-31 | 2009-06-18 | Carter Ernst B | System for and method of uniform synchronization between multiple kernels running on single computer systems with multiple CPUs installed |
US20090235263A1 (en) * | 2008-03-17 | 2009-09-17 | Fujitsu Limited | Job assignment apparatus, job assignment method, and computer-readable medium |
US20100005323A1 (en) * | 2006-06-07 | 2010-01-07 | Yuki Kuroda | Semiconductor integrated circuit |
US7657933B2 (en) | 2003-04-12 | 2010-02-02 | Cavium Networks, Inc. | Apparatus and method for allocating resources within a security processing architecture using multiple groups |
US7694158B2 (en) * | 2005-04-19 | 2010-04-06 | Stmicroelectronics S.R.L. | Parallel processing method and system, for instance for supporting embedded cluster platforms, computer program product therefor |
US20100115116A1 (en) * | 2008-11-03 | 2010-05-06 | Micron Technology, Inc. | System and method for switching communication protocols in electronic interface devices |
US20100131729A1 (en) | 2004-12-21 | 2010-05-27 | Koninklijke Philips Electronics N.V. | Integrated circuit with improved device security |
US7734933B1 (en) | 2005-06-17 | 2010-06-08 | Rockwell Collins, Inc. | System for providing secure and trusted computing environments through a secure computing module |
US20100185719A1 (en) * | 2000-06-26 | 2010-07-22 | Howard Kevin D | Apparatus For Enhancing Performance Of A Parallel Processing Environment, And Associated Methods |
US20100192225A1 (en) | 2009-01-28 | 2010-07-29 | Juniper Networks, Inc. | Efficient application identification with network devices |
US7770179B1 (en) | 2004-01-30 | 2010-08-03 | Xilinx, Inc. | Method and apparatus for multithreading on a programmable logic device |
US20110002184A1 (en) | 2007-10-09 | 2011-01-06 | Samsung Electronics Co., Ltd. | Method of detecting a light attack against a memory device and memory device employing a method of detecting a light attack |
US7886340B2 (en) | 2002-06-13 | 2011-02-08 | Engedi Technologies | Secure remote management appliance |
US20110088037A1 (en) | 2009-10-13 | 2011-04-14 | Roman Glistvain | Single-stack real-time operating system for embedded systems |
US20110107357A1 (en) | 2009-11-03 | 2011-05-05 | Ian Henry Stuart Cullimore | TCP/IP Stack-Based Operating System |
US8055822B2 (en) * | 2007-08-21 | 2011-11-08 | International Business Machines Corporation | Multicore processor having storage for core-specific operational data |
US20120017262A1 (en) | 2000-09-25 | 2012-01-19 | Harsh Kapoor | Systems and methods for processing data flows |
US8132001B1 (en) | 2005-10-19 | 2012-03-06 | Sprint Communications Company L.P. | Secure telephony service appliance |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6507563B1 (en) | 1998-12-24 | 2003-01-14 | Cisco Technology, Inc. | Methods and apparatus for controlling congestion within diverse protocol stacks |
US7076803B2 (en) | 2002-01-28 | 2006-07-11 | International Business Machines Corporation | Integrated intrusion detection services |
US7424710B1 (en) | 2002-12-18 | 2008-09-09 | Vmware, Inc. | TCP/IP offloading for virtual machines |
US20040249957A1 (en) | 2003-05-12 | 2004-12-09 | Pete Ekis | Method for interface of TCP offload engines to operating systems |
US7363369B2 (en) * | 2003-10-16 | 2008-04-22 | International Business Machines Corporation | Monitoring thread usage to dynamically control a thread pool |
US20050278720A1 (en) * | 2004-05-27 | 2005-12-15 | Samsung Electronics Co., Ltd. | Distribution of operating system functions for increased data processing performance in a multi-processor architecture |
US20090217020A1 (en) | 2004-11-22 | 2009-08-27 | Yourst Matt T | Commit Groups for Strand-Based Computing |
KR100646858B1 (en) | 2004-12-08 | 2006-11-23 | 한국전자통신연구원 | Hardware device and behavior manner for creation and management of socket information based on TOE |
US7779164B2 (en) | 2005-04-04 | 2010-08-17 | Oracle America, Inc. | Asymmetrical data processing partition |
US8136124B2 (en) | 2007-01-18 | 2012-03-13 | Oracle America, Inc. | Method and apparatus for synthesizing hardware counters from performance sampling |
JP2008276331A (en) * | 2007-04-25 | 2008-11-13 | Toshiba Corp | Controller for multiprocessor and its method |
US8875276B2 (en) | 2011-09-02 | 2014-10-28 | Iota Computing, Inc. | Ultra-low power single-chip firewall security device, system and method |
US8904216B2 (en) | 2011-09-02 | 2014-12-02 | Iota Computing, Inc. | Massively multicore processor and operating system to manage strands in hardware |
2011
- 2011-09-02: US 13/224,938, patent US8904216B2, not active (Expired - Fee Related)
- 2011-12-21: US 13/333,802, patent US8607086B2, not active (Expired - Fee Related)
2012
- 2012-08-09: EP12827760.5A, patent EP2751700A4, not active (Withdrawn)
- 2012-08-09: PCT/US2012/050101, patent WO2013032660A1, active (Application Filing)
Patent Citations (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5469553A (en) | 1992-04-16 | 1995-11-21 | Quantum Corporation | Event driven power reducing software state machine |
US5493689A (en) | 1993-03-01 | 1996-02-20 | International Business Machines Corporation | System for configuring an event driven interface including control blocks defining good loop locations in a memory which represent detection of a characteristic pattern |
US5710910A (en) | 1994-09-30 | 1998-01-20 | University Of Washington | Asynchronous self-tuning clock domains and method for transferring data among domains |
US5968133A (en) | 1997-01-10 | 1999-10-19 | Secure Computing Corporation | Enhanced security network time synchronization device and method |
US5896499A (en) | 1997-02-21 | 1999-04-20 | International Business Machines Corporation | Embedded security processor |
US7055173B1 (en) | 1997-12-19 | 2006-05-30 | Avaya Technology Corp. | Firewall pooling in a network flowswitch |
US6714536B1 (en) | 1998-07-21 | 2004-03-30 | Eric M. Dowling | Method and apparatus for cosocket telephony |
US20020007420A1 (en) * | 1998-12-18 | 2002-01-17 | Microsoft Corporation | Adaptive flow control protocol |
US7308686B1 (en) | 1999-12-22 | 2007-12-11 | Ubicom Inc. | Software input/output using hard real time threads |
US20100185719A1 (en) * | 2000-06-26 | 2010-07-22 | Howard Kevin D | Apparatus For Enhancing Performance Of A Parallel Processing Environment, And Associated Methods |
US20040093520A1 (en) | 2000-07-03 | 2004-05-13 | Hak-Moo Lee | Firewall system combined with embedded hardware and general-purpose computer |
US20120017262A1 (en) | 2000-09-25 | 2012-01-19 | Harsh Kapoor | Systems and methods for processing data flows |
US7036064B1 (en) | 2000-11-13 | 2006-04-25 | Omar Kebichi | Synchronization point across different memory BIST controllers |
US20070118596A1 (en) * | 2000-12-05 | 2007-05-24 | Microsoft Corporation | System and method for implementing a client side http stack |
US20020167965A1 (en) * | 2001-01-18 | 2002-11-14 | James Beasley | Link context mobility method and system for providing such mobility, such as a system employing short range frequency hopping spread spectrum wireless protocols |
US20070008976A1 (en) | 2001-04-11 | 2007-01-11 | Aol Llc | Local Protocol Server |
US7002979B1 (en) | 2001-08-10 | 2006-02-21 | Utstarcom, Inc. | Voice data packet processing system |
US20030084190A1 (en) | 2001-10-25 | 2003-05-01 | Kimball Robert H. | Apparatus and system for maintaining accurate time in a wireless environment |
US20040210320A1 (en) | 2002-06-11 | 2004-10-21 | Pandya Ashish A. | Runtime adaptable protocol processor |
US7886340B2 (en) | 2002-06-13 | 2011-02-08 | Engedi Technologies | Secure remote management appliance |
US7334124B2 (en) | 2002-07-22 | 2008-02-19 | Vormetric, Inc. | Logical access block processing protocol for transparent secure file storage |
US20040049624A1 (en) | 2002-09-06 | 2004-03-11 | Oak Technology, Inc. | Network to computer internal interface |
US20040143751A1 (en) | 2003-01-17 | 2004-07-22 | Cyrus Peikari | Protection of embedded processing systems with a configurable, integrated, embedded firewall |
US20080109665A1 (en) | 2003-02-14 | 2008-05-08 | International Business Machines Corporation | Network processor power management |
US20070022421A1 (en) | 2003-04-09 | 2007-01-25 | Eric Lescouet | Operating systems |
US7657933B2 (en) | 2003-04-12 | 2010-02-02 | Cavium Networks, Inc. | Apparatus and method for allocating resources within a security processing architecture using multiple groups |
US7509673B2 (en) | 2003-06-06 | 2009-03-24 | Microsoft Corporation | Multi-layered firewall architecture |
CN1622517A (en) | 2003-11-27 | 2005-06-01 | 上海安创信息科技有限公司 | An embedded information security platform |
US7246272B2 (en) | 2004-01-16 | 2007-07-17 | International Business Machines Corporation | Duplicate network address detection |
US7770179B1 (en) | 2004-01-30 | 2010-08-03 | Xilinx, Inc. | Method and apparatus for multithreading on a programmable logic device |
US20060026162A1 (en) | 2004-07-19 | 2006-02-02 | Zoran Corporation | Content management system |
US20100131729A1 (en) | 2004-12-21 | 2010-05-27 | Koninklijke Philips Electronics N.V. | Integrated circuit with improved device security |
US20060133370A1 (en) | 2004-12-22 | 2006-06-22 | Avigdor Eldar | Routing of messages |
US7694158B2 (en) * | 2005-04-19 | 2010-04-06 | Stmicroelectronics S.R.L. | Parallel processing method and system, for instance for supporting embedded cluster platforms, computer program product therefor |
US7734933B1 (en) | 2005-06-17 | 2010-06-08 | Rockwell Collins, Inc. | System for providing secure and trusted computing environments through a secure computing module |
US8132001B1 (en) | 2005-10-19 | 2012-03-06 | Sprint Communications Company L.P. | Secure telephony service appliance |
US20070211633A1 (en) * | 2006-03-13 | 2007-09-13 | Microsoft Corporation | Competitive and considerate congestion control |
US20070255861A1 (en) | 2006-04-27 | 2007-11-01 | Kain Michael T | System and method for providing dynamic network firewall with default deny |
US20100005323A1 (en) * | 2006-06-07 | 2010-01-07 | Yuki Kuroda | Semiconductor integrated circuit |
US20080046891A1 (en) | 2006-07-12 | 2008-02-21 | Jayesh Sanchorawala | Cooperative asymmetric multiprocessing for embedded systems |
US20090126003A1 (en) | 2007-05-30 | 2009-05-14 | Yoggie Security Systems, Inc. | System And Method For Providing Network And Computer Firewall Protection With Dynamic Address Isolation To A Device |
US8055822B2 (en) * | 2007-08-21 | 2011-11-08 | International Business Machines Corporation | Multicore processor having storage for core-specific operational data |
US20110002184A1 (en) | 2007-10-09 | 2011-01-06 | Samsung Electronics Co., Ltd. | Method of detecting a light attack against a memory device and memory device employing a method of detecting a light attack |
US20090158299A1 (en) * | 2007-10-31 | 2009-06-18 | Carter Ernst B | System for and method of uniform synchronization between multiple kernels running on single computer systems with multiple CPUs installed |
TW200924424A (en) | 2007-11-21 | 2009-06-01 | Inventec Corp | System for intrusion detection system |
US20090235263A1 (en) * | 2008-03-17 | 2009-09-17 | Fujitsu Limited | Job assignment apparatus, job assignment method, and computer-readable medium |
US20100115116A1 (en) * | 2008-11-03 | 2010-05-06 | Micron Technology, Inc. | System and method for switching communication protocols in electronic interface devices |
US20100192225A1 (en) | 2009-01-28 | 2010-07-29 | Juniper Networks, Inc. | Efficient application identification with network devices |
US20110088037A1 (en) | 2009-10-13 | 2011-04-14 | Roman Glistvain | Single-stack real-time operating system for embedded systems |
US20110107357A1 (en) | 2009-11-03 | 2011-05-05 | Ian Henry Stuart Cullimore | TCP/IP Stack-Based Operating System |
WO2011056808A1 (en) | 2009-11-03 | 2011-05-12 | Iota Computing, Inc. | Tcp/ip stack-based operating system |
US20120042088A1 (en) | 2009-11-03 | 2012-02-16 | Ian Henry Stuart Cullimore | TCP/IP Stack-Based Operating System |
Non-Patent Citations (40)
Title |
---|
"Yoggie Pico Personal Security Appliance," www.yoggie.com. (archived on May 31, 2009) [Accessed Feb. 16, 2011-Archive.org]. |
"Yoggie Security Unveils Miniature Hardware Appliance," www.yoggie.com. (archived on May 31, 2009) [Accessed Feb. 16, 2011-Archive.org]. |
"Yoggie Unveils Miniature Internet Security Devices for Mac Computers," M2 Telecomworldwire,Oct. 14, 2008. [Accessed Feb. 18, 2011-Academic Source Complete]. |
Antoniou, S. "Networking Basics: TCP, UDP, TCP/IP and OSI Model," Oct. 29, 2007, (retrieved Jun. 4, 2013) 8 pages. |
Antoniou, S. "Networking Basics: TCP, UDP, TCP/IP and OSI Model," Oct. 29, 2007, <www.translingal.com/blog/networking-basics-tco=udp-tcpip-osi-models> (retrieved Jun. 4, 2013) 8 pages. |
Ashkenazi et al. "Platform Independent Overall Security Architecture in Multi-Processor System-On-Chip ICs for Use in Mobile Phones and Handheld Devices," World Automation Congress, Jul. 24-26, 2006. [Accessed Feb. 18, 2011-Engineering Village]. |
Bathen et al. "Inter and Intra Kernel Reuse Analysis Driven Pipelining on Chip-Multiprocessors," International Symposium on VLSI Design, Automation and Test, Apr. 26-29, 2010. p. 203-207. [Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpis/abs all.jsp?amumber=5496725. |
Bathen et al. "Inter and Intra Kernel Reuse Analysis Driven Pipelining on Chip-Multiprocessors," International Symposium on VLSI Design, Automation and Test, Apr. 26-29, 2010. p. 203-206. [Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpis/abs all.jsp?amumber=5496725. |
Benini et al.: "Finite-state machine partitioning for low power," 1998, IEEE. |
Bolchini et al. "Smart Card Embedded Information Systems: A Methodology for Privacy Oriented Architectural Design," Data & Knowledge Engineering, 2002. vol. 41, No. 2-3, p. 159-182. [Accessed Feb. 16, 2011-ScienceDirect.com]. |
Bolchini et al. "Smart Card Embedded Information Systems: A Methodology for Privacy Oriented Architectural Design," Data & Knowledge Engineering, 2002. vol. 41, p. 159-182. [Accessed Feb. 16, 2011-ScienceDirect.com]. |
Cavium Networks, "Nitrox® DPI L7 Content Processor Family," Accessed on Feb. 16, 2011 at http://www.caviumnetworks.com/processor-NITROX-DPI.html. |
Cavium Networks, "Nitrox® Lite," Accessed on Feb. 16, 2011 at http://www.caviumnetworks.com/processor-securitLnitroxLite.htm. |
Ferrante et al. "Application-Driven Optimization of VLIW Architectures: A Hardware-Software Approach," Proceedings of the 11th IEEE Real Time and Embedded Technology and Applications Symposium, Mar. 7-10, 2005. p. 128-137. [Accessed Feb. 15, 2011-IEEExplore] http://ieeexplore.ieee.org/xpls/abs-all.jsp?arnumber=1388380. |
Ferrante et al. "Application-Driven Optimization of VLIW Architectures: A Hardware-Software Approach," 11th IEEE Real Time and Embedded Technology and Applications Symposium, Mar. 7-10, 2005. pp. 128-137. [Accessed Feb. 15, 2011-IEEExplore] http://ieeexplore.ieee.org/xpls/abs-all.jsp?arnumber=1388380. |
Freescale Semiconductor, "IP Multimedia Subsystems," 2006. (brochure) [Accessed Feb. 16, 2011] http://cache.freescale.com/files/32bit/doc/brochure/BRIMSSOLUTIONS.pdf. |
Green Hills Software, "μ-velOSity™ Real-Time Microkernel," Accessed on Feb. 16, 2011 at http://www.ghs.com/products/micro-velosity. html. |
Green Hills Software, "mu-velOSity Real-Time Microkernel," Accessed on Feb. 16, 2011 at http://www.ghs.com/products/micro-velosity. html. |
Green Hills Software, "μ-velOSity Real-Time Microkernel," Accessed on Feb. 16, 2011 at http://www.ghs.com/products/micro—velosity. html. |
Green Hills Software, Inc., "mu-velOSity Microkernel," (datasheet) 2006. |
Green Hills Software, Inc., "mu-velOSity Microkernel," (datasheet-2pgs.) 2006. |
Green Hills Software, Inc., "μ-velOSity Microkernel," (datasheet) 2006. |
Green Hills Software, Inc., "μ-velOSity Microkernel," (datasheet—2pgs.) 2006. |
Hattori. "Challenges for Low-Power Embedded SOC's," International Symposium on VLSI Design, Automation and Test, Apr. 25-27, 2007. 4pgs. [Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpis/abs-all.jsp? arnumber=4239406. |
Hattori. "Challenges for Low-Power Embedded SOC's," International Symposium on VLSI Design, Automation and Test, Apr. 25-27, 2007. p. 1. [Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpis/abs-all.jsp?arnumber=4239406. |
International Search Report and Written Opinion mailed Dec. 30, 2010 in Patent Cooperation Treaty application No. PCT/US10/55186, filed Nov. 2, 2010. |
Journal of Technology & Science, "Express Logic, Inc.; Express Logic and IAR Systems Team Up to Provide ThreadX RTOS Support in IAR Embedded Workbench IDE for Freescale ColdFire," Accessed on Feb. 16, 2011 at http://proquest.umi.com.mutex.gmu.edu/pqdweb?index=7 &did=1541305 . . . . |
Kakarountas et al. "Implementation of HSSec: A High-Speed Cryptographic Co-Processor," IEEE Conference On Emerging Technologies and Factory Automation, Sep. 25-28, 2007. p. 625-631. [Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpls/abs-all.jsp?amumber=4416827. |
Ke et al. "Design of PC/104 Processor Module Based on ARM," International Conference on Electrical and Control Engineering, Jun. 25-27, 2010. p. 775-777. [Accessed Feb. 17, 2011-IEEExplore] http://ieeexplore.ieee.org/xpis/abs-all.jsp?arnumber=5630566. |
Kinebuchi et al. "A Hardware Abstraction Layer for Integrating Real-Time and General-Purpose with Minimal Kernel Modification," Software Technologies for Future Dependable Distributed Systems, Mar. 17, 2009. p. 112-116.[Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpls/abs-all.jsp?arnumber=4804582. |
Nguyen et al. "Real-Time Operating Systems for Small Microcontrollers," IEEE Micro, Sep.-Oct. 2009. vol. 29, No. 5, p. 30-45. [Accessed Feb. 15, 2011-IEEExplore] http://ieeexplore.ieee.org/xpis/abs-all.jsp?arnumber=5325154. |
Quan Huang et al.: "Embedded firewall based on network processor", 2005, IEEE, Proceedings of the Second International Conference on Embedded Software and Systems (ICESS'05), 7 pages. |
Tabari, et al. "Neural Network Processor for a FPGA-based Multiband Fluorometer Device," International Workshop on Computer Architecture for Machine Perception and Sensing, Aug. 18-20, 2006. p. 198-202. [Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpls/abs-all.jsp?amumber=4350381. |
Tabari, et al. "Neural Network Processor for a FPGA-based Multiband Fluorometer Device," International Workshop on Computer Architecture for Machine Perception and Sensing, Sep. 2006. p. 198-202. [Accessed Feb. 16, 2011-IEEExplore] http://ieeexplore.ieee.org/xpls/abs-all.jsp?amumber=4350381. |
Tan et al.: "A simulation framework for energy-consumption analysis of OS-driven embedded applications," IEEE, vol. 22, No. 9, Sep. 2003. |
Wang et al. "Towards High-Performance Network Intrusion Prevention System on Multi-core Network Services Processor," 15th International Conference on Parallel and Distributed Systems, Dec. 8-11, 2009. p. 220-227. [Accessed Feb. 16, 2011-IEEExplore]. |
Wong, William, "16-Bit MCU Invades 8-Bit Territory with 4-by 4-mm Chip," Electronic Design, Sep. 29, 2005. vol. 53, No. 21, p. 32. [Accessed Feb. 16, 2011-Academic Search Complete]. |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9436521B2 (en) | 2009-11-03 | 2016-09-06 | Iota Computing, Inc. | TCP/IP stack-based operating system |
US9705848B2 (en) | 2010-11-02 | 2017-07-11 | Iota Computing, Inc. | Ultra-small, ultra-low power single-chip firewall security device with tightly-coupled software and hardware |
US8875276B2 (en) | 2011-09-02 | 2014-10-28 | Iota Computing, Inc. | Ultra-low power single-chip firewall security device, system and method |
US8904216B2 (en) | 2011-09-02 | 2014-12-02 | Iota Computing, Inc. | Massively multicore processor and operating system to manage strands in hardware |
US10725972B2 (en) | 2013-03-21 | 2020-07-28 | Razer (Asia-Pacific) Pte. Ltd. | Continuous and concurrent device experience in a multi-device ecosystem |
US10515056B2 (en) | 2013-03-21 | 2019-12-24 | Razer (Asia-Pacific) Pte. Ltd. | API for resource discovery and utilization |
US9858052B2 (en) | 2013-03-21 | 2018-01-02 | Razer (Asia-Pacific) Pte. Ltd. | Decentralized operating system |
US11483127B2 (en) | 2018-11-18 | 2022-10-25 | Mellanox Technologies, Ltd. | Clock synchronization |
US11637557B2 (en) | 2018-11-26 | 2023-04-25 | Mellanox Technologies, Ltd. | Synthesized clock synchronization between network devices |
US10778406B2 (en) | 2018-11-26 | 2020-09-15 | Mellanox Technologies, Ltd. | Synthesized clock synchronization between networks devices |
US11283454B2 (en) | 2018-11-26 | 2022-03-22 | Mellanox Technologies, Ltd. | Synthesized clock synchronization between network devices |
US11543852B2 (en) | 2019-11-07 | 2023-01-03 | Mellanox Technologies, Ltd. | Multihost clock synchronization |
US11070304B1 (en) | 2020-02-25 | 2021-07-20 | Mellanox Technologies, Ltd. | Physical hardware clock chaining |
US12081427B2 (en) | 2020-04-20 | 2024-09-03 | Mellanox Technologies, Ltd. | Time-synchronization testing in a network element |
US11552871B2 (en) | 2020-06-14 | 2023-01-10 | Mellanox Technologies, Ltd. | Receive-side timestamp accuracy |
US11606427B2 (en) | 2020-12-14 | 2023-03-14 | Mellanox Technologies, Ltd. | Software-controlled clock synchronization of network devices |
US11588609B2 (en) | 2021-01-14 | 2023-02-21 | Mellanox Technologies, Ltd. | Hardware clock with built-in accuracy check |
US12111681B2 (en) | 2021-05-06 | 2024-10-08 | Mellanox Technologies, Ltd. | Network adapter providing isolated self-contained time services |
US12028155B2 (en) | 2021-11-24 | 2024-07-02 | Mellanox Technologies, Ltd. | Controller which adjusts clock frequency based on received symbol rate |
US11907754B2 (en) | 2021-12-14 | 2024-02-20 | Mellanox Technologies, Ltd. | System to trigger time-dependent action |
US11835999B2 (en) | 2022-01-18 | 2023-12-05 | Mellanox Technologies, Ltd. | Controller which adjusts clock frequency based on received symbol rate |
US11706014B1 (en) | 2022-01-20 | 2023-07-18 | Mellanox Technologies, Ltd. | Clock synchronization loop |
US11917045B2 (en) | 2022-07-24 | 2024-02-27 | Mellanox Technologies, Ltd. | Scalable synchronization of network devices |
US12216489B2 (en) | 2023-02-21 | 2025-02-04 | Mellanox Technologies, Ltd | Clock adjustment holdover |
Also Published As
Publication number | Publication date |
---|---|
EP2751700A4 (en) | 2015-02-25 |
US8904216B2 (en) | 2014-12-02 |
WO2013032660A1 (en) | 2013-03-07 |
US20130061078A1 (en) | 2013-03-07 |
US20130061070A1 (en) | 2013-03-07 |
EP2751700A1 (en) | 2014-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8607086B2 (en) | 2013-12-10 | Massively multicore processor and operating system to manage strands in hardware |
Shantharama et al. | 2020 | Hardware-accelerated platforms and infrastructures for network functions: A survey of enabling technologies and research studies |
TWI556092B (en) | 2016-11-01 | Priority based application event control (paec) to reduce power consumption |
US7376952B2 (en) | 2008-05-20 | Optimizing critical section microblocks by controlling thread execution |
KR101516109B1 (en) | 2015-04-29 | Reducing power consumption of uncore circuitry of a processor |
KR101476568B1 (en) | 2014-12-24 | Providing per core voltage and frequency control |
KR101572079B1 (en) | 2015-11-27 | Providing state storage in a processor for system management mode |
JP5809366B2 (en) | 2015-11-10 | Method and system for scheduling requests in portable computing devices |
US20050262365A1 (en) | 2005-11-24 | P-state feedback to operating system with hardware coordination |
US20160266633A1 (en) | 2016-09-15 | Methods and Systems for Coordination of Operating States amongst Multiple SOCs within a Computing Device |
US20120297216A1 (en) | 2012-11-22 | Dynamically selecting active polling or timed waits |
US11853787B2 (en) | 2023-12-26 | Dynamic platform feature tuning based on virtual machine runtime requirements |
US20070113229A1 (en) | 2007-05-17 | Thread aware distributed software system for a multi-processor |
US11422849B2 (en) | 2022-08-23 | Technology for dynamically grouping threads for energy efficiency |
US11640305B2 (en) | 2023-05-02 | Wake-up and timer for scheduling of functions with context hints |
US11921558B1 (en) | 2024-03-05 | Using network traffic metadata to control a processor |
US20220329450A1 (en) | 2022-10-13 | Device wake-up technologies |
KR20240159790A (en) | 2024-11-06 | Embedded system execution method and device, embedded system and chip |
US20230153121A1 (en) | 2023-05-18 | Accelerator usage prediction for improved accelerator readiness |
US20250004535A1 (en) | 2025-01-02 | Optimized power management for computer systems |
Tuveri et al. | 2013 | A runtime adaptive H. 264 video-decoding MPSoC platform |
US20250013493A1 (en) | 2025-01-09 | Frequency scaling in multi-tenant environments |
US20230305927A1 (en) | 2023-09-28 | Register replay state machine |
US20220155847A1 (en) | 2022-05-19 | Technologies for a processor to enter a reduced power state while monitoring multiple addresses |
Johansson et al. | 2024 | Enhancing Infrastructure to Boost Energy Efficiency in 5G and 6G Core Networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2012-01-06 | AS | Assignment | Owner name: IOTA COMPUTING, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CULLIMORE, IAN HENRY STUART;REEL/FRAME:027496/0618; Effective date: 20110901 |
2013-11-20 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
2017-05-25 | FPAY | Fee payment | Year of fee payment: 4 |
2021-08-02 | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
2022-01-17 | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
2022-01-17 | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
2022-02-08 | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20211210 |