New DRAM access technology doubled overclocking performance


The frequency of a microprocessor can be raised dramatically in many ways, but it is limited by the performance of main memory and must be reduced to keep the computer system stable. This article reviews an access technology that improves the access speed of the dynamic random access memory (DRAM) cell. It grew out of a study on reducing the layout area of the static random access memory (SRAM) cell.

Overclocking and memory relevance

Increasing the supply voltage and lowering the ambient temperature help raise the frequency of microprocessors, chipsets, and main memory. These are the physical aspects of overclocking a computer system, which involves the microprocessor, chipset, main memory, and motherboard. The software aspect is keeping the operating system (OS) and application programs stable during execution after overclocking.

When a system is overclocked, applications with frequent math calculations and large amounts of data access may generate more heat than the die packaging materials and external cooling devices can dissipate. Automatic overclocking technology that monitors the system and adjusts the frequency therefore becomes critical. Another use of automatic overclocking is to confirm whether a motherboard fitted with a given microprocessor, chipset, and main memory can reach the overclocking limit with an external heat sink. When the basic input/output system (BIOS) provides the automatic overclocking function, the personal computer (PC) does not have to boot into the OS, so no disk drive needs to be connected. The overclocking limit can be found quickly, and disk drive wear is reduced.
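The monitor-and-adjust loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `read_temperature` and `is_stable` are hypothetical hooks standing in for a real sensor and stress test, and the frequencies, step size, and temperature limit are made-up values, not a real BIOS interface.

```python
from typing import Callable


def find_overclock_limit(
    base_mhz: int,
    max_mhz: int,
    step_mhz: int,
    read_temperature: Callable[[int], float],  # hypothetical sensor hook
    is_stable: Callable[[int], bool],          # hypothetical stress-test hook
    temp_limit_c: float,
) -> int:
    """Raise the frequency step by step and return the last frequency that
    stayed within the temperature limit and passed the stability check."""
    best = base_mhz
    freq = base_mhz + step_mhz
    while freq <= max_mhz:
        if read_temperature(freq) > temp_limit_c or not is_stable(freq):
            break  # stop here and keep the last known-good frequency
        best = freq
        freq += step_mhz
    return best


# Toy models used only to exercise the loop (assumed, not measured).
def toy_temperature(freq_mhz: int) -> float:
    return 40.0 + (freq_mhz - 3000) * 0.05


def toy_stability(freq_mhz: int) -> bool:
    return freq_mhz <= 3800


limit = find_overclock_limit(3000, 4500, 100, toy_temperature, toy_stability, 85.0)
print(limit)  # 3800 with the toy models above
```

In a real system the two hooks would be the hard part; the loop itself only records the last setting that passed both checks.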

Because the microprocessor accesses peripherals through main memory, the stability of main memory affects the performance of the microprocessor. Even if the microprocessor itself can be overclocked, its main memory must also be able to keep up with the higher frequency, which is the role of an overclocking memory module.

Data transmission interface

The data transfer interface of single data rate synchronous dynamic random access memory (SDR SDRAM) is built around the access characteristics of DRAM: a DRAM cell must be refreshed to retain its stored state, and every read requires an additional write-back operation. A write needs no such extra operation, but it still takes some time for the stored value to settle. Because the DRAM write-back time is far longer than the cycle time of a high-speed microprocessor, the data transfer interface has to be specially designed. Since the development of double data rate (DDR), SDRAM has offered a better price/performance ratio than other data transfer interfaces such as Rambus DRAM (RDRAM). Today, DDR SDRAM is further divided into standard and mobile variants.

Figure 1 shows a simplified functional block diagram of SDRAM. CAS# is a delay control signal designed around precharge. That is, if no precharge were needed, the row address strobe signal (RAS#) would not have to be controlled in a time-multiplexed manner. The frequency of the clock signals (CLK, CKE) is based on the operating frequency of the microprocessor, and the data mask signal (DQM) is aligned to the clock edge. These signals are used for synchronous transmission. The number of sense amplifiers and write drivers is usually matched to the bit width of the external data bus. However, parallel access can be introduced to increase access efficiency by widening the column address so that a different group of sense amplifiers and write drivers can be selected. This approach gives rise to burst mode and same-row accesses, but it does not make an individual access faster, and because transfers must still be synchronous, a data buffer is required.


Figure 1: A brief functional block diagram of SDRAM

Figure 2 shows the SDRAM command sequence, with reference to the specification table of the Micron Technology MT48H8M16LF (Mobile SDRAM). The simplest command sequences are a single read and a single write. The sequence shown in the figure first performs a precharge (PRE), then an activate (ACT), and finally a read or write access (RD or WR), and so on.


Figure 2: SDRAM Command Sequence: Single Read or Single Write

In the figure, the clock cycle time (tCK) is the time from one clock edge to the next. The row precharge time (tRP) is the time from the PRE command to the ACT command. The row address strobe to column address strobe delay (tRCD) is the time from the ACT command to the RD or WR command. The column address strobe latency (CL) is the wait time after the RD command is issued, expressed as a multiple of tCK. The write recovery time (tWR) is the time from the WR command to the PRE command. The write delay can alternatively be expressed as a multiple of tCK, in the same way as the column address strobe latency, hence the name column address strobe write latency (CWL). The row active time (tRAS) is the time from the ACT command to the PRE command. The row cycle time (tRC) is the time from one ACT command to the next ACT command.
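As a rough illustration of how these parameters add up along the PRE, ACT, RD/WR sequence, the sketch below sums them for a single read and a single write. The class name and the example numbers (7.5 ns clock, 3-3-3 timings, 2 cycles of write time) are assumptions chosen for easy arithmetic; they are not taken from the Micron specification table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SdramTiming:
    tck_ns: float  # clock cycle time
    trp_ck: int    # PRE -> ACT, in clock cycles
    trcd_ck: int   # ACT -> RD/WR, in clock cycles
    cl_ck: int     # RD -> first data, in clock cycles
    twr_ck: int    # WR -> PRE, in clock cycles

    def single_read_ns(self) -> float:
        """PRE -> ACT -> RD -> data: tRP + tRCD + CL."""
        return (self.trp_ck + self.trcd_ck + self.cl_ck) * self.tck_ns

    def single_write_ns(self) -> float:
        """PRE -> ACT -> WR -> ready for the next PRE: tRP + tRCD + tWR."""
        return (self.trp_ck + self.trcd_ck + self.twr_ck) * self.tck_ns


# Illustrative values only (assumed, not datasheet figures).
example = SdramTiming(tck_ns=7.5, trp_ck=3, trcd_ck=3, cl_ck=3, twr_ck=2)
print(example.single_read_ns())   # 67.5
print(example.single_write_ns())  # 60.0
```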

The main parameters that can be set for DDR SDRAM on a PC are tRP, tRCD, and CL. When a memory module is overclocked, its performance also depends on the minimum clock cycle time, the write time, and the maximum supply voltage.

 

Data transfer interface access efficiency

SDRAM access efficiency is high with burst mode and same-row access. If frequent access to the same row is required, data structures and data processing can also be optimized at the software level. Optimizing a data structure means analyzing which data fields are accessed frequently and merging those fields into the same data structure, so that they can be stored under the same row address in main memory. Optimizing data processing means reducing how often accesses to different data structures are interleaved.
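The effect of merging frequently accessed fields can be illustrated with a toy address model. The sketch below counts how often consecutive accesses land in a different DRAM row for two layouts. The row size, field size, and base addresses are assumptions, and the model ignores banks and real address mapping, so it only shows the trend.

```python
ROW_BYTES = 2048   # assumed DRAM row size
FIELD_BYTES = 8    # assumed size of one data field


def row_of(address: int) -> int:
    return address // ROW_BYTES


def count_row_switches(addresses) -> int:
    """Count how many consecutive accesses land in a different row."""
    switches = 0
    previous = None
    for addr in addresses:
        row = row_of(addr)
        if previous is not None and row != previous:
            switches += 1
        previous = row
    return switches


def separate_arrays(n_records: int, base_a: int, base_b: int):
    """Fields a and b live in two arrays; each record touches both."""
    for i in range(n_records):
        yield base_a + i * FIELD_BYTES
        yield base_b + i * FIELD_BYTES


def merged_records(n_records: int, base: int):
    """Fields a and b sit side by side in one record."""
    for i in range(n_records):
        yield base + i * 2 * FIELD_BYTES
        yield base + i * 2 * FIELD_BYTES + FIELD_BYTES


n = 256
split = count_row_switches(separate_arrays(n, 0, 1 << 20))
merged = count_row_switches(merged_records(n, 0))
print(split, merged)  # 511 1
```

With the fields in separate arrays, every access alternates between two rows. With merged records, the same 512 accesses cross a row boundary only once.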

If the probability of burst mode and same-row access is too low, access efficiency drops sharply, and the data transfer interface ends up slower than the standalone write speed of the DRAM cell. This can be seen from the single read and single write command sequences: both perform a precharge, even though a DRAM cell does not need to be precharged for a write. With SDRAM transmission technology, the way software processes data therefore also affects how fast the program code executes. If the software is not optimized for burst mode, three options remain for speeding up execution: overclock, upgrade the main memory, or upgrade the PC.
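One simple way to see how the hit probability drives efficiency is an expected-value model. This is an assumed simplification, not the author's analysis: a same-row access is taken to cost only CL cycles, while any other access pays tRP + tRCD + CL.

```python
def average_read_cycles(hit_probability: float, trp: int, trcd: int, cl: int) -> float:
    """Expected clock cycles per read under a two-case (hit/miss) model."""
    if not 0.0 <= hit_probability <= 1.0:
        raise ValueError("hit_probability must be between 0 and 1")
    hit_cost = cl                # row already open: only the read latency
    miss_cost = trp + trcd + cl  # PRE, ACT, then RD
    return hit_probability * hit_cost + (1.0 - hit_probability) * miss_cost


# Example with 11-11-11 timings (CL-tRCD-tRP).
for p in (1.0, 0.5, 0.0):
    print(p, average_read_cycles(p, trp=11, trcd=11, cl=11))
# 1.0 11.0
# 0.5 22.0
# 0.0 33.0
```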

1T DRAM module overclocking performance

If 3T DRAM is regarded as the first generation of DRAM technology, then 1T DRAM, which uses a differential amplifier for reading, is the second generation. The third generation of DRAM technology discussed here replaces the differential amplifier and significantly improves read access. A 1T DRAM memory cell consists of one transistor and one storage capacitor. Figure 3 shows the waveforms of an access operation on a single memory cell. The upper half shows the second-generation technology, which uses a differential amplifier, and the lower half shows the third-generation technology. The diagram mainly compares the maximum time each technology needs for a read operation. The capacitor must be refreshed (read) as soon as it reaches the minimum differential voltage (Min. ΔV), so the figure shows that the maximum read time equals the maximum refresh time. The figure is labeled with tprecharge, tread, trewrite, and twrite, which correspond to the tRP, tRCD, CL, and CWL specifications. The third-generation technology does not require tRP. In addition, its tRCD is very short and can be absorbed into CL and CWL. Its read speed is therefore close to that of SRAM, although its refresh efficiency is lower than that of SRAM.


Figure 3: Waveforms for a read operation on a single memory cell

Figure 4 shows how the SDRAM command sequence changes with different access technologies and is used to compare the access efficiency of the second-generation and third-generation technologies. There are many possible SDRAM command sequences. Among them, READ to WRITE best highlights the difference between access technologies. The third-generation technology has a short read time (tread), so the CL value can be small, but it cannot be 0 because of the limits imposed by the clock signal and the DQM signal.

Figure 4: SDRAM command sequence: read command to write command

Consider the Samsung specification for the K4A4G165WD. One of its speed grades is DDR4-1600 (11-11-11), for which the clock cycle time (tCK) is 1.25 ns and the CWL is 9. The figure 1600 is the transmission speed of the data transfer interface. For SDR, the transmission speed equals the clock frequency. For DDR, it is twice the clock frequency.
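The relationship between transmission speed, clock frequency, and tCK can be checked with a few lines of arithmetic. The helper names below are made up for this sketch. The DDR4-1600 figures (11-11-11, CWL 9) are the ones quoted above.

```python
def clock_mhz(transfer_rate_mts: float, transfers_per_clock: int) -> float:
    """Clock frequency from the transfer rate: 1 transfer per clock for SDR, 2 for DDR."""
    return transfer_rate_mts / transfers_per_clock


def tck_ns(clock_mhz_value: float) -> float:
    """Clock cycle time in nanoseconds."""
    return 1000.0 / clock_mhz_value


ddr4_clock = clock_mhz(1600, 2)  # 800.0 MHz
ddr4_tck = tck_ns(ddr4_clock)    # 1.25 ns
cl_ns = 11 * ddr4_tck            # 13.75 ns
cwl_ns = 9 * ddr4_tck            # 11.25 ns
print(ddr4_clock, ddr4_tck, cl_ns, cwl_ns)
```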

In Figure 2, tRCD and tRP are aligned to the positive edge of the clock signal, so the minimum value of both is zero. In Figure 4, the minimum value of CL is limited by the clock signal and is therefore 1. CWL deserves special mention: in normal operation, the product of CWL and tCK must be greater than or equal to the write time of the memory cell (twrite). When overclocking without increasing the CWL value, the refresh command must be executed more frequently. In addition, data errors become more likely because IC process variations and leakage currents make the access time differ from one memory cell to another. For stability, the CL and CWL values must be increased, even with special cooling. If third-generation DRAM technology is produced under the same manufacturing conditions, the minimum speed bin setting can be (1-0-0), with the same CWL value as in the product specification above. The access performance of third-generation DRAM technology, even before it is overclocked, can therefore exceed that of overclocked memory modules built with second-generation DRAM technology.
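To put rough numbers on this comparison, the sketch below evaluates the (11-11-11) and (1-0-0) settings at tCK = 1.25 ns and checks the CWL × tCK ≥ twrite condition. The twrite value of 10 ns and the overclocked tCK of 1.0 ns are assumptions picked purely for illustration, since the article does not give them.

```python
import math


def row_miss_read_ns(cl: int, trcd: int, trp: int, tck_ns: float) -> float:
    """Latency of a PRE -> ACT -> RD sequence: (tRP + tRCD + CL) x tCK."""
    return (trp + trcd + cl) * tck_ns


def min_cwl(twrite_ns: float, tck_ns: float) -> int:
    """Smallest CWL that satisfies CWL x tCK >= twrite."""
    return math.ceil(twrite_ns / tck_ns)


TCK_NS = 1.25
second_gen = row_miss_read_ns(11, 11, 11, TCK_NS)  # 41.25 ns
third_gen = row_miss_read_ns(1, 0, 0, TCK_NS)      # 1.25 ns

ASSUMED_TWRITE_NS = 10.0                           # hypothetical cell write time
cwl_stock = min_cwl(ASSUMED_TWRITE_NS, 1.25)       # 8, so the specified CWL of 9 is enough
cwl_overclocked = min_cwl(ASSUMED_TWRITE_NS, 1.0)  # 10, so CWL 9 is no longer enough
print(second_gen, third_gen, cwl_stock, cwl_overclocked)
```

Under these assumptions, shortening tCK pushes the required CWL above the specified value, which matches the point that CWL has to rise when overclocking.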

3T SRAM module overclocking performance

The 3T DRAM cell was the first DRAM cell to be implemented, and it later evolved into the 1T DRAM cell to drastically reduce IC layout area. In the same year, I found that the frequency of the microprocessor was limited by the DRAM frequency and attempted to build an SRAM cell from three transistors. Its layout area is similar to that of a 3T DRAM cell. If SDRAM is replaced by synchronous static random access memory (SSRAM), the access efficiency is much higher than that of third-generation DRAM because the CWL value can reach 0. Under these conditions, access can approach the transmission speed of the data transfer interface even without burst mode and same-row access. If a 3T SRAM module is overclocked, the access time is proportional to the switching time of the transistors, and a rise in temperature reduces the switching time, so the CL and CWL settings do not need to increase because of overclocking, nor is special cooling required.

Conclusion

According to a research report, program code can fail to execute properly because of soft errors when DRAM cells are read, which requires DRAM modules to add an error correction code (ECC). Thinking in reverse raises a question: the microprocessor and chipset contain many registers, so why do these studies not explicitly state that those registers should also use ECC to reduce soft errors?

After observing the waveform of an access to a single cell with second-generation DRAM technology, we can see that differential amplifiers have a very low discrimination level and are therefore more susceptible to interference than buffers and SRAM cells. Overclocking can increase the processing speed of the microprocessor, but the efficiency of accessing large amounts of data depends on the main memory technology. Memory with poor access performance is more likely to leave the microprocessor idle (NOP time) while it waits for data, which can increase power consumption after overclocking and makes it hard for overclocking to speed up the processing of fragmented data.

 

 
©2020  ValPont.com
