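Ability:
( LV2,RANK:10 )
#2
Thanks for sharing! I have used several earlier versions. When I compared them, the code optimization was not even as good as VS2008's, let alone VS2013's.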
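Ability:
( LV2,RANK:10 )
#3
You must be joking, mate. Post your example.
I have never seen anything that optimizes better than this compiler.
I have compared the generated code myself.
It is about as aggressive as optimization gets.
Never mind the larger output file: it generates many alternative code paths, checks the CPU at run time, and then decides which path to take.
That said, it compiles some inline assembly differently from VS:
__asm push global_variable
VS compiles this as
push ds:[global_variable]
while this compiler turns it directly into
push offset global_variable
It is stricter. Tracking that problem down cost me a lot of brain cells back then.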
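To make the "many code paths, pick one after checking the CPU" remark concrete, here is a minimal sketch of run-time dispatch. It is not what the Intel compiler actually emits. The names `sum_baseline`, `sum_unrolled`, `select_sum` and the `cpu_has_fast_path` flag are all hypothetical, and a real dispatcher would query CPUID instead of taking a bool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Baseline path: a plain byte-by-byte sum that runs on any CPU.
static std::uint64_t sum_baseline(const std::uint8_t* p, std::size_t n) {
    std::uint64_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Stand-in for a CPU-specific path. A dispatching compiler would build this
// variant for a newer instruction set; here it is only unrolled by hand so
// the sketch stays portable.
static std::uint64_t sum_unrolled(const std::uint8_t* p, std::size_t n) {
    std::uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += p[i];
        s1 += p[i + 1];
        s2 += p[i + 2];
        s3 += p[i + 3];
    }
    for (; i < n; ++i) s0 += p[i];  // leftover tail bytes
    return s0 + s1 + s2 + s3;
}

using SumFn = std::uint64_t (*)(const std::uint8_t*, std::size_t);

// Hypothetical selector: the flag stands in for a real CPUID feature check.
SumFn select_sum(bool cpu_has_fast_path) {
    return cpu_has_fast_path ? sum_unrolled : sum_baseline;
}

// Entry point callers would use: the check happens once, then the chosen
// path is called through a function pointer.
std::uint64_t sum_dispatched(const std::uint8_t* p, std::size_t n,
                             bool cpu_has_fast_path) {
    assert(p != nullptr || n == 0);
    return select_sum(cpu_has_fast_path)(p, n);
}
```

Both paths have to return the same result. The price is that every variant is stored in the binary, which fits the larger file size mentioned above.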
Ability:
( LV12,RANK:340 )
#4
I tested with a fast CRC32 implementation found online. The source is:
dd5K9s2c8@1M7q4)9K6b7g2)9J5c8W2)9J5c8X3y4J5k6h3q4@1k6g2)9J5k6i4y4@1k6i4m8Z5j5h3&6Q4x3X3c8T1M7Y4g2E0L8h3g2Q4x3X3g2U0L8$3#2Q4x3V1k6U0M7X3x3K6x3W2)9J5c8R3`.`. . I made a few small changes to the source.
The test data size is 512 MB.
This is VS2015 with the default Release configuration:
Please wait ...
bitwise : CRC=8AC52C35, 5.064s, 101.097 MB/s
half-byte : CRC=8AC52C35, 2.432s, 210.563 MB/s
1 byte at once: CRC=8AC52C35, 1.030s, 497.165 MB/s
4 bytes at once: CRC=8AC52C35, 0.423s, 1209.604 MB/s
8 bytes at once: CRC=8AC52C35, 0.371s, 1378.404 MB/s
4x8 bytes at once: CRC=8AC52C35, 0.318s, 1609.898 MB/s
16 bytes at once: CRC=8AC52C35, 0.228s, 2243.487 MB/s
16 bytes at once: CRC=8AC52C35, 0.230s, 2225.956 MB/s (including prefetching)
chunked : CRC=8AC52C35, 0.230s, 2228.896 MB/s
This is PS XE 2016, Release build:
Please wait ...
bitwise : CRC=8AC52C35, 5.045s, 101.484 MB/s
half-byte : CRC=8AC52C35, 2.421s, 211.484 MB/s
1 byte at once: CRC=8AC52C35, 1.223s, 418.477 MB/s
4 bytes at once: CRC=8AC52C35, 0.464s, 1103.497 MB/s
8 bytes at once: CRC=8AC52C35, 0.245s, 2086.200 MB/s
4x8 bytes at once: CRC=8AC52C35, 0.240s, 2133.008 MB/s
16 bytes at once: CRC=8AC52C35, 0.188s, 2718.889 MB/s
16 bytes at once: CRC=8AC52C35, 0.186s, 2751.569 MB/s (including prefetching)
chunked : CRC=8AC52C35, 0.185s, 2762.833 MB/s
This is clang/LLVM 3.9.0:
Please wait ...
bitwise : CRC=8AC52C35, 3.899s, 131.328 MB/s
half-byte : CRC=8AC52C35, 2.470s, 207.290 MB/s
1 byte at once: CRC=8AC52C35, 1.080s, 474.236 MB/s
4 bytes at once: CRC=8AC52C35, 0.482s, 1063.224 MB/s
8 bytes at once: CRC=8AC52C35, 0.306s, 1672.723 MB/s
4x8 bytes at once: CRC=8AC52C35, 0.241s, 2124.406 MB/s
16 bytes at once: CRC=8AC52C35, 0.161s, 3184.907 MB/s
16 bytes at once: CRC=8AC52C35, 0.158s, 3236.586 MB/s (including prefetching)
chunked : CRC=8AC52C35, 0.157s, 3270.657 MB/s
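In some cases (here, the "1 byte at once" and "4 bytes at once" variants) ICC really is slower than VS 2015, but overall the improvement is still substantial.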
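For readers unfamiliar with the variant names in the output above, here is a minimal sketch of what "bitwise" and "1 byte at once" usually mean for CRC32. It is not the benchmarked source. It assumes the standard reflected CRC-32 polynomial 0xEDB88320, and the function names are my own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the standard reflected CRC-32 polynomial (as used by zlib/PNG).
constexpr std::uint32_t kPoly = 0xEDB88320u;

// "bitwise": no lookup table, eight shift/xor steps per input byte.
std::uint32_t crc32_bitwise(const void* data, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    const auto* p = static_cast<const std::uint8_t*>(data);
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; ++bit) {
            // (0u - (crc & 1u)) is all ones when the low bit is set, else zero.
            crc = (crc >> 1) ^ (kPoly & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// 256-entry table holding the bitwise result for every possible byte value.
struct Crc32Table {
    std::uint32_t v[256];
    Crc32Table() {
        for (std::uint32_t i = 0; i < 256; ++i) {
            std::uint32_t c = i;
            for (int bit = 0; bit < 8; ++bit)
                c = (c >> 1) ^ (kPoly & (0u - (c & 1u)));
            v[i] = c;
        }
    }
};

// "1 byte at once": one table lookup replaces the eight inner iterations.
std::uint32_t crc32_bytewise(const void* data, std::size_t len) {
    static const Crc32Table table;
    std::uint32_t crc = 0xFFFFFFFFu;
    const auto* p = static_cast<const std::uint8_t*>(data);
    while (len--)
        crc = (crc >> 8) ^ table.v[(crc ^ *p++) & 0xFFu];
    return ~crc;
}

// Both variants must agree on the well-known CRC-32 check value.
void crc32_self_check() {
    const char msg[] = "123456789";
    assert(crc32_bitwise(msg, 9) == 0xCBF43926u);
    assert(crc32_bytewise(msg, 9) == 0xCBF43926u);
}
```

The 4-, 8- and 16-byte variants in the benchmark presumably extend the same idea with more tables, so each loop iteration consumes more input. That gives the optimizer more room, which may be why the compilers differ most on those rows.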
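Ability:
( LV2,RANK:10 )
#5
You didn't even turn on Intel SSE3.
Where Intel really shines is optimizing for its own CPUs.
If you want balance, VS is still better; the code it generates is more even.
Intel goes to extremes.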
Ability:
( LV3,RANK:20 )
#6
Thanks a lot. Nice.
Ability:
( LV5,RANK:60 )
#7
300K9s2c8@1M7q4)9K6b7g2)9J5c8W2)9J5c8X3c8D9i4K6u0W2k6r3!0%4L8X3I4G2j5h3c8D9P5g2)9J5k6h3W2J5i4K6u0r3c8X3W2D9k6i4y4Q4x3V1k6e0L8$3k6@1N6$3q4J5k6g2)9J5c8V1W2F1N6r3g2D9i4K6g2X3f1r3q4J5j5h3I4D9k6h3I4Q4y4h3k6e0N6s2g2V1K9h3!0Q4y4h3k6j5c8g2)9#2k6U0t1H3x3e0k6Q4y4h3k6g2M7r3c8S2N6r3f1@1i4K6g2X3c8r3!0%4L8X3I4G2j5h3c8D9P5g2)9J5k6h3W2J5i4K6u0W2M7r3q4J5N6o6q4Q4x3X3g2J5j5i4t1`.
f02K9s2c8@1M7q4)9K6b7g2)9J5c8W2)9J5c8X3c8D9i4K6u0W2k6r3!0%4L8X3I4G2j5h3c8D9P5g2)9J5k6h3W2J5i4K6u0r3c8X3W2D9k6i4y4Q4x3V1k6e0L8$3k6@1N6$3q4J5k6g2)9J5c8V1W2F1N6r3g2D9i4K6g2X3f1r3q4J5j5h3I4D9k6h3I4Q4y4h3k6e0N6s2g2V1K9h3!0Q4y4h3k6j5c8g2)9#2k6U0t1H3x3e0k6Q4y4h3k6g2M7r3c8S2N6r3f1@1i4K6g2X3c8r3!0%4L8X3I4G2j5h3c8D9P5g2)9J5k6h3W2J5i4K6u0W2M7r3q4J5N6o6u0Q4x3X3g2J5j5i4t1`.
b79K9s2c8@1M7q4)9K6b7g2)9J5c8W2)9J5c8X3c8D9i4K6u0W2k6r3!0%4L8X3I4G2j5h3c8D9P5g2)9J5k6h3W2J5i4K6u0r3c8X3W2D9k6i4y4Q4x3V1k6e0L8$3k6@1N6$3q4J5k6g2)9J5c8V1W2F1N6r3g2D9i4K6g2X3f1r3q4J5j5h3I4D9k6h3I4Q4y4h3k6e0N6s2g2V1K9h3!0Q4y4h3k6j5c8g2)9#2k6U0t1H3x3e0k6Q4y4h3k6g2M7r3c8S2N6r3f1@1i4K6g2X3c8r3!0%4L8X3I4G2j5h3c8D9P5g2)9J5k6h3W2J5i4K6u0W2M7r3q4J5N6o6y4Q4x3X3g2J5j5i4t1`.
Ability:
( LV2,RANK:10 )
#8
207K9s2c8@1M7q4)9K6b7g2)9J5c8W2)9J5c8Y4u0W2k6$3W2K6N6s2u0S2N6r3W2G2L8X3y4W2L8Y4c8W2M7W2)9J5k6r3c8G2N6$3&6D9L8$3q4V1i4K6u0W2K9h3&6@1k6h3I4Q4x3X3g2U0L8$3#2Q4x3V1k6S2K9$3c8D9L8g2)9J5c8X3W2J5j5#2)9#2k6X3&6S2M7#2)9J5c8Y4c8W2j5#2)9J5c8U0V1&6y4e0g2Q4x3V1k6H3j5i4u0S2L8r3I4W2L8q4)9#2k6Y4y4@1N6h3c8A6L8#2)9#2k6Y4S2W2i4K6g2X3x3U0l9I4y4#2)9#2k6Y4g2H3k6r3q4@1k6e0q4Q4y4h3k6K6k6i4c8#2M7q4)9J5k6h3g2^5k6b7`.`.
Just go straight to the new version. You all look worn out from searching.
Ability:
( LV2,RANK:10 )
#9
With Update4: 551K9s2c8@1M7q4)9K6b7g2)9J5c8W2)9J5c8Y4u0W2k6$3W2K6N6s2u0S2N6r3W2G2L8X3y4W2L8Y4c8W2M7W2)9J5k6r3c8G2N6$3&6D9L8$3q4V1i4K6u0W2K9h3&6@1k6h3I4Q4x3X3g2U0L8$3#2Q4x3V1k6S2K9$3c8D9L8g2)9J5c8X3W2J5j5#2)9#2k6X3&6S2M7#2)9J5c8U0V1%4y4K6m8Q4x3V1k6H3j5i4u0S2L8r3I4W2L8q4)9#2k6Y4y4@1N6h3c8A6L8#2)9#2k6Y4S2W2i4K6g2X3x3U0l9I4y4W2)9#2k6Y4g2H3k6r3q4@1k6e0c8Q4y4h3k6K6k6i4c8#2M7q4)9J5k6h3g2^5k6b7`.`.
Ability:
( LV12,RANK:340 )
#10
thanks
Ability:
( LV12,RANK:340 )
#11
thanks
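Ability:
( LV2,RANK:10 )
#12
This 2017 SP1 fixed the problem of unreferenced functions being compiled into the binary.
The update list mentions it under BUG FIX.
It should be one of those fixes.
The key from the forum works.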
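Ability:
( LV5,RANK:70 )
#13
核心未拥有
This 2017 SP1 fixed the problem of unreferenced functions being compiled into the binary.
The update list mentions it under BUG FIX.
It should be one of those fixes.
The key from the forum works.
Thanks. I read a few articles about Intel® Parallel Studio XE and learned that 2017 Update 1 is considered a relatively good version, so I decided to use it.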
Ability:
( LV2,RANK:10 )
#14
d57K9s2c8@1M7q4)9K6b7g2)9J5c8W2)9J5c8Y4u0W2k6$3W2K6N6s2u0S2N6r3W2G2L8X3y4W2L8Y4c8W2M7W2)9J5k6r3c8G2N6$3&6D9L8$3q4V1i4K6u0W2K9h3&6@1k6h3I4Q4x3X3g2U0L8$3#2Q4x3V1k6S2K9$3c8D9L8g2)9J5c8X3W2J5j5#2)9#2k6X3&6S2M7#2)9J5c8U0V1%4y4K6m8Q4x3V1k6H3j5i4u0S2L8r3I4W2L8q4)9#2k6Y4y4@1N6h3c8A6L8#2)9#2k6Y4S2W2i4K6g2X3x3U0l9I4y4W2)9#2k6Y4g2H3k6r3q4@1k6e0c8Q4y4h3k6K6k6i4c8#2M7q4)9J5k6h3g2^5k6b7`.`.
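The version I downloaded did not ask me for a lic file during installation. It asked for a serial number instead. What is going on? This is my first time using it, so I would appreciate some guidance from the experts.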
Ability:
( LV6,RANK:90 )
#15
thanks