`

关于size_t 和 ptrdiff_t 【转】

 
阅读更多

Abstract

The article will help the readers understand what size_t and ptrdiff_t types are, what they are used for and when they must be used. The article will be interesting for those developers who begin creation of 64-bit applications where use of size_t and ptrdiff_t types provides high performance, possibility to operate large data sizes and portability between different platforms.

Introduction

Before we begin I would like to notice that the definitions and recommendations given in the article refer to the most popular architectures for the moment (IA-32, Intel 64, IA-64) and may not fully apply to some exotic architectures.

The types size_t and ptrdiff_t were created to perform correct address arithmetic. It had been assumed for a long time that the size of int coincides with the size of a computer word (microprocessor's capacity) and it can be used as indexes to store sizes of objects or pointers. Correspondingly, address arithmetic was built with the use of int and unsigned types as well. int type is used in most training materials on programming in C and C++ in the loops' bodies and as indexes. The following example is nearly a canon:

for (int i = 0; i < n; i++)
  a[i] = 0;

As microprocessors developed over time and their capacity increased, it became irrational to further increase int type's sizes. There are a lot of reasons for that: economy of memory used, maximum portability etc. As a result, several data model appeared declaring the relations of C/C++ base types. Table N1 shows the main data models and lists the most popular systems using them.

Table N1. Data models

Table N1. Data models

As you can see from the table, it is not so easy to choose a variable's type to store a pointer or an object's size. To find the smartest solution of this problem size _t and ptrdiff_t types were created. They are guaranteed to be used for address arithmetic. And now the following code must become a canon:

for(ptrdiff_t i =0; i < n; i++)
  a[i]=0;

 It is this code that can provide safety, portability and good performance. The rest of the article explains why.

size_t type

size_t type is a base unsigned integer type of C/C++ language. It is the type of the result returned by sizeof operator. The type's size is chosen so that it could store the maximum size of a theoretically possible array of any type. On a 32-bit system size_t will take 32 bits, on a 64-bit one 64 bits. In other words, a variable of size_t type can safely store a pointer. The exception is pointers to class functions but this is a special case. Although size_t can store a pointer, it is better to use another unsinged integer type uintptr_t for that purpose (its name reflects its capability). The types size_t and uintptr_t are synonyms. size_t type is usually used for loop counters, array indexing and address arithmetic.

The maximum possible value of size_t type is constant SIZE_MAX.

ptrdiff_t type

ptrdiff_t type is a base signed integer type of C/C++ language. The type's size is chosen so that it could store the maximum size of a theoretically possible array of any type. On a 32-bit system ptrdiff_t will take 32 bits, on a 64-bit one 64 bits. Like in size_t, ptrdiff_t can safely store a pointer except for a pointer to a class function. Also, ptrdiff_t is the type of the result of an expression where one pointer is subtracted from the other (ptr1-ptr2). ptrdiff_t type is usually used for loop counters, array indexing, size storage and address arithmetic. ptrdiff_t type has its synonym intptr_t whose name indicates more clearly that it can store a pointer.

Portability of size_t and ptrdiff_t

The types size_t and ptrdiff_t enable you to write well-portable code. The code created with the use of size_t and ptrdiff_t types is easy-portable. The size of size_t and ptrdiff_t always coincide with the pointer's size. Because of this, it is these types that should be used as indexes for large arrays, for storage of pointers and pointer arithmetic.

Linux-application developers often use long type for these purposes. Within the framework of 32-bit and 64-bit data models accepted in Linux, this really works. long type's size coincides with the pointer's size. But this code is incompatible with Windows data model and, consequently, you cannot consider it easy-portable. A more correct solution is to use types size_t and ptrdiff_t.

As an alternative to size_t and ptrdiff_t, Windows-developers can use types DWORD_PTR, SIZE_T, SSIZE_T etc. But still it is desirable to confine to size_t and ptrdiff_t types.

Safety of ptrdiff_t and size_t types in address arithmetic

Address arithmetic issues have been occurring very frequently since the beginning of adaptation of 64-bit systems. Most problems of porting 32-bit applications to 64-bit systems relate to the use of such types as int and long which are unsuitable for working with pointers and type arrays. The problems of porting applications to 64-bit systems are not limited by this, but most errors relate to address arithmetic and operation with indexes.

Here is a simplest example:

size_t n =...;
for(unsigned i =0; i < n; i++)
  a[i]=0;

If we deal with the array consisting of more than UINT_MAX items, this code is incorrect. It is not easy to detect an error and predict the behavior of this code. The debug-version will hung but hardly will anyone process gigabytes of data in the debug-version. And the release-version, depending on the optimization settings and code's peculiarities, can either hung or suddenly fill all the array cells correctly producing thus an illusion of correct operation. As a result, there appear floating errors in the program occurring and vanishing with a subtlest change of the code. To learn more about such phantom errors and their dangerous consequences see the article "A 64-bit horse that can count" [1].

Another example of one more "sleeping" error which occurs at a particular combination of the input data (values of A and B variable):

int A =-2;
unsigned B =1;
int array[5]={1,2,3,4,5};
int*ptr = array +3;
ptr = ptr +(A + B);//Error
printf("%i\n",*ptr);

 This code will be correctly performed in the 32-bit version and print number "3". After compilation in 64-bit mode there will be a fail when executing the code. Let's examine the sequence of code execution and the cause of the error:

  • A variable of int type is cast into unsigned type;
  • A and B are summed. As a result, we get 0xFFFFFFFF value of unsigned type;
  • "ptr + 0xFFFFFFFFu" expression is calculated. The result depends on the pointer's size on the current platform. In the 32-bit program, the expression will be equal to "ptr - 1" and we will successfully print number 3. In the 64-bit program, 0xFFFFFFFFu value will be added to the pointer and as a result, the pointer will be far beyond the array's limits.

注: 哪个能表示更大的数就转为哪个类型。例如short+long,就要转为long.
在32位机上, unsigned int 最大可表示2^32 - 1; int最大可表示2^31-1, 这样int就转为了unsigned int

 

Such errors can be easily avoided by using size_t or ptrdiff_t types. In the first case, if the type of "i" variable is size_t, there will be no infinite loop. In the second case, if we use size_t or ptrdiff_t types for "A" and "B" variable, we will correctly print number "3".

Let's formulate a guideline: wherever you deal with pointers or arrays you should use size_t and ptrdiff_t types.

To learn more about the errors you can avoid by using size_t and ptrdiff_t types, see the following articles:

Performance of code using ptrdiff_t and size_t

Besides code safety, the use of ptrdiff_t and size_t types in address arithmetic can give you an additional gain of performance. For example, using int type as an index, the former's capacity being different from that of the pointer, will lead to that the binary code will contain additional data conversion commands. We speak about 64-bit code where pointers' size is 64 bits and int type's size remains 32 bits.

It is a difficult task to give a brief example of size_t type's advantage over unsigned type. To be objective we should use the compiler's optimizing abilities. And the two variants of the optimized code frequently become too different to show this very difference. We managed to create something like a simple example only with a sixth try. And still the example is not ideal because it demonstrates not those unnecessary data type conversions we spoke above, but that the compiler can build a more efficient code when using size_t type. Let's consider a program code arranging an array's items in the inverse order:

unsigned arraySize;
...
for(unsigned i =0; i < arraySize /2; i++){
  float value = array[i];
  array[i]= array[arraySize - i -1];
  array[arraySize - i -1]= value;
}

In the example, "arraySize" and "i" variables have unsigned type. This type can be easily replaced with size_t type, and now compare a small fragment of assembler code shown on Figure 1.

Figure N1.Comparison of 64-bit assembler code when using unsigned and size_t types

Figure N1.Comparison of 64-bit assembler code when using unsigned and size_t types

The compiler managed to build a more laconic code when using 64-bit registers. I am not affirming that the code created with the use of unsigned type will operate slower than the code using size_t. It is a very difficult task to compare speeds of code execution on modern processors. But from the example you can see that when the compiler operates arrays using 64-bit types it can build a shorter and faster code.

Proceeding from my own experience I can say that reasonable replacement of int and unsigned types with ptrdiff_t and size_t can give you an additional performance gain up to 10% on a 64-bit system. You can see an example of speed increase when using ptrdiff_t and size_t types in the fourth section of the article "Development of Resource-intensive Applications in Visual C++" [5].

Code refactoring with the purpose of moving to ptrdiff_t and size_t

As the reader can see, using ptrdiff_t and size_t types gives some advantages for 64-bit programs. However, it is not a good way out to replace all unsigned types with size_t ones. Firstly, it does not guarantee correct operation of a program on a 64-bit system. Secondly, it is most likely that due to this replacement, new errors will appear data format compatibility will be violated and so on. You should not forget that after this replacement the memory size needed for the program will greatly increase as well. And increase of the necessary memory size will slow down the application's work for cache will store fewer objects being dealt with.

Consequently, introduction of ptrdiff_t and size_t types into old code is a task of gradual refactoring demanding a great amount of time. In fact, you should look through the whole code and make the necessary alterations. Actually, this approach is too expensive and inefficient. There are two possible variants:

  • To use specialized tools like Viva64 included into PVS-Studio. Viva64 is a static code analyzer detecting sections where it is reasonable to replace data types for the program to become correct and work efficiently on 64-bit systems. To learn more, see "PVS-Studio Tutorial" [6].
  • If you do not plan to adapt a 32-bit program for 64-bit systems, there is no sense in data types' refactoring. A 32-bit program will not benefit in any way from using ptrdiff_t and size_t types.

References

本文转自:

http://www.viva64.com/en/a/0050/ 

分享到:
评论

相关推荐

    QAC工具介绍和使用说明(供一种可量化措施的代码度量值属性:33基于功能 32基于文件和4个项目级别)

    如上所述:QAC随提供一套标准库的头文件,如果想改变这些类型定义,必须先明白QAC内部的定义类型,因为那些头文件包含一些声明ptrdiff_t, size_t 和wchar_t,还有3种宏指令定义PRQA_PTRDIFF_T, PRQA_SIZE_T,和PRQA_...

    stl_interfaces:用于定义迭代器的C ++ 14和更高版本的CRTP模板

    stl_interfaces Boost.Iterator的iterator_facade和iterator_adaptor部分(现在称为iterator_interface )的更新的C ++ 20友好版本;... using difference_type = std:: ptrdiff_t ; using pointe

    Google C++ Style Guide(Google C++编程规范)高清PDF

    link ▶Don't use an #include when a forward declaration would suffice. When you include a header file you introduce a dependency that will cause your code to be recompiled whenever the header file ...

    span-lite:span lite-单文件标头库中的C ++ 98,C ++ 11和更高版本的类似于C ++ 20的跨度

    include &lt; iostream&gt;std:: ptrdiff_t size ( nonstd::span&lt; const&gt; spn ){ return spn. size ();}int main (){ int arr[] = { 1 , }; std::cout &lt;&lt; " C-array: " &lt;&lt; size ( arr ) &lt;&lt; " array: " &...

    Python课程设计 课设 手写数字识别卷积神经网络源码+文档说明.zip

    高分设计源码,详情请查看资源内容中使用说明 高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明高分设计源码,详情请查看资源内容中使用说明

    SpringBoot2.0快速开发框架权限.rar

    SpringBoot2.0快速开发框架权限.rarSpringBoot2.0快速开发框架权限.rarSpringBoot2.0快速开发框架权限.rar

    大语言模型的微调和推理baichuan7B, chatglm2-6B, Qwen-7B-chat源码.zip

    详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;详情请查看资源内容中使用说明;

    基于Qt与STM32平台开发的汽车车机系统上位机

    基于Qt开发的汽车车机系统上位机 & 常见类型汽车传感器信号模拟发生器 任务和要求: 任务: 根据发动机测控系统信号需求,设计一套发动机信号模拟器人机交互系统,能够根据需要向下位机输出控制信号,使其输出发动机测控系统需要的传感器模拟信号,给发动机测控系统的开发提供方便。 要求: 1.设计应包含上位机与下位机的交互程序及人机交互界面的设计,与下位机设计相结合,使其能够实现全部类型发动机传感器信号的模拟输出及显示。 2.设计中需要采用模块化开发程序。 3.所设计的人机交互界面简洁合理。 4.应考虑所设计系统的实用性。 具体工作内容: 1.根据设计目标,查阅相关设计标准和设计方法资料,对发动机信号模拟器设计中的关键工程原理和工程方法进行提炼,并围绕关键问题进行国内外设计现状调研,开展分析、评价与总结,确定主要研究内容,制定设计技术路线,制定设计计划(周进度),撰写开题报告,并进行开题答辩,开题报告参考文献应不少于15篇(其中外文文献不少于 4 篇,近五年文献不少于三分之一)。 2.根据设计要求和技术指标,进行满足功能原理需求的多方案拟定,考虑安全、 标准等多因素进行技术性与经济性评价

    实验-三、数据库安全性(目的、要求和模板).doc

    实验-三、数据库安全性(目的、要求和模板).doc

    毕设绝技 - 4天玩乐完成商城系统完整资料day02

    文件为第二天视频教程 在毕业设计的挑战中,有时我们需要以极短的时间完成一个相对复杂的项目,比如一个商城系统。虽然时间紧迫,但只要我们合理规划、高效执行,完全有可能在4天内完成一个基础且功能完备的商城系统。 商城系统,也被称为网上商城系统或Online Mall system,是一种功能完善的网上销售系统。该系统主要包括产品发布、在线订购、在线支付、在线客服等功能模块,旨在为企业或个人提供一个在线销售平台,实现商品的展示、交易和客户服务。 商城系统具有多种核心功能,如商品管理、订单管理、用户管理和营销管理。商品管理功能支持商品的添加、编辑、删除、分类和搜索,满足商家对商品信息的全面管理需求。订单管理功能则涵盖订单的生成、支付、发货、退款和评价等环节,确保交易流程的顺畅进行。用户管理功能包括用户的注册、登录、个人信息管理和收货地址管理等,提升用户体验。而营销管理功能则通过促销活动的设置、优惠券的发放和积分兑换等手段,帮助商家提升销售业绩。 商城系统的特点主要体现在功能性、易用性和安全性上。商城系统注重功能性的开发,每个功能都有其发挥作用的地方,满足商家的实际需求。

    忻州师范学院-论文答辩PPT模板我给母校送模板作品.pptx

    PPT模板,答辩PPT模板,毕业答辩,学术汇报,母校模板,我给母校送模板作品,周会汇报,开题答辩,教育主题模板下载。PPT素材下载。

    小型餐饮管理系统-数据库设计报告.doc

    小型餐饮管理系统-数据库设计报告.doc

    毕业设计+Python+基于OpenCV的交通路口红绿灯控制系统设计+Sqlite +PyCharm 1.zip.zip

    本资源中的源码都是经过本地编译过可运行的,下载后按照文档配置好环境就可以运行。资源项目的难度比较适中,内容都是经过助教老师审定过的,应该能够满足学习、使用需求,如果有需要的话可以放心下载使用。有任何问题也可以随时私信博主,博主会第一时间给您解答!!! 本资源中的源码都是经过本地编译过可运行的,下载后按照文档配置好环境就可以运行。资源项目的难度比较适中,内容都是经过助教老师审定过的,应该能够满足学习、使用需求,如果有需要的话可以放心下载使用。有任何问题也可以随时私信博主,博主会第一时间给您解答!!! 本资源中的源码都是经过本地编译过可运行的,下载后按照文档配置好环境就可以运行。资源项目的难度比较适中,内容都是经过助教老师审定过的,应该能够满足学习、使用需求,如果有需要的话可以放心下载使用。有任何问题也可以随时私信博主,博主会第一时间给您解答!!

    西南交通大学-毕业答辩PPT模板我给母校送模板作品.pptx

    PPT模板,答辩PPT模板,毕业答辩,学术汇报,母校模板,我给母校送模板作品,周会汇报,开题答辩,教育主题模板下载。PPT素材下载。

    2024年中国中空纤维膜行业研究报告.docx

    2024年中国中空纤维膜行业研究报告

    四川师范大学-PPT模板我给母校送模板作品.pptx

    PPT模板,答辩PPT模板,毕业答辩,学术汇报,母校模板,我给母校送模板作品,周会汇报,开题答辩,教育主题模板下载。PPT素材下载。

    实验三、数据库安全性实验报告.doc

    实验三、数据库安全性实验报告.doc

    西北农林科技大学-PPT模板我给母校送模板作品.pptx

    PPT模板,答辩PPT模板,毕业答辩,学术汇报,母校模板,我给母校送模板作品,周会汇报,开题答辩,教育主题模板下载。PPT素材下载。

    java电子相册源码.rar

    java电子相册源码.rarjava电子相册源码.rarjava电子相册源码.rarjava电子相册源码.rar

    玉米脱粒机设计及其总装配图(论文、dwg图).rar

    玉米脱粒机设计及其总装配图(论文、dwg图)

Global site tag (gtag.js) - Google Analytics