Conversation
serge-sans-paille
left a comment
There was a problem hiding this comment.
Some early review, a few open questions but nothing looks bad here. Way to go!
| { | ||
| inline void get_cpuid(int reg[4], int level, int count = 0) noexcept; | ||
|
|
||
| inline std::uint32_t get_xcr0_low() noexcept; |
There was a problem hiding this comment.
it would be good to somehow ensure that this type is the same as ref_t
There was a problem hiding this comment.
And this could be a private static method for better encapsulation right?
| constexpr bool fma4() const noexcept | ||
| { | ||
| return utils::bit_is_set<16>(m_regs.reg8[2]); | ||
| } |
There was a problem hiding this comment.
Open question: should we have a macro that makes the generation of those field cleaner to the eye?
There was a problem hiding this comment.
Either way. I am personally not big on macros. I find it easier read this way (could also be made to fit on one line) and find&replace is quite enough a change is required.
|
|
||
| namespace detail | ||
| { | ||
| inline void get_cpuid(int reg[4], int level, int count) noexcept |
There was a problem hiding this comment.
Same here: could be private for better encapsulation.
There was a problem hiding this comment.
On the contrary, I was wondering whether to make it part of public API.
It's a well defined function, unlikely to change, and that could be useful for users to get features we do not expose (not sure we'll go to cover 100% of cpuid).
| * The full license is in the file LICENSE, distributed with this software. * | ||
| ****************************************************************************/ | ||
|
|
||
| #include "../config/xsimd_inline.hpp" |
There was a problem hiding this comment.
why do you need this?
There was a problem hiding this comment.
It for header including what they use (otherwise I get nasty errors in my editor).
But this one did not belong here but in xsimd_register.hpp where XSIMD_INLINE is used.
AntoinePrv
left a comment
There was a problem hiding this comment.
Comments on change may make relative to the previous implementation
| #elif defined(__INTEL_COMPILER) | ||
| __cpuid(reg.data(), level); |
There was a problem hiding this comment.
Should we change to use inline ASM for intel compiler? Missing the count option here.
There was a problem hiding this comment.
I assumle so, feel free to give it a try!
| #elif defined(_MSC_VER) && _MSC_VER >= 1400 | ||
| return static_cast<xcr0_reg_t>(_xgetbv(0)); | ||
|
|
||
| #elif defined(__GNUC__) |
There was a problem hiding this comment.
What about __clang__ and __INTEL_COMPILER? Should we reproduce the get_cpuid pattern?
There was a problem hiding this comment.
Yes, I think so. Also, IIRC, __clang__ comes with __GNUC__ defined too, and Intel compiler defines __GNUC__ or _MSC_VER depending on the platform. So we may simplify this logic (but first we need to verify that my asumption is right).
There was a problem hiding this comment.
It's true that clang defines __GNUC__. And it happens to be true for icx too, see https://godbolt.org/z/56j57vhnT
|
@serge-sans-paille what do you think of this? I wanted to add an |
| #elif defined(_MSC_VER) && _MSC_VER >= 1400 | ||
| return static_cast<xcr0_reg_t>(_xgetbv(0)); | ||
|
|
||
| #elif defined(__GNUC__) |
There was a problem hiding this comment.
Yes, I think so. Also, IIRC, __clang__ comes with __GNUC__ defined too, and Intel compiler defines __GNUC__ or _MSC_VER depending on the platform. So we may simplify this logic (but first we need to verify that my asumption is right).
|
@AntoinePrv please ping me when you need a review! |
|
@serge-sans-paille this is ready. This is intended to make a clean abstraction for CPUID.
|
| regs.reg1 = detail::get_cpuid(0x1); | ||
| regs.reg7 = detail::get_cpuid(0x7); | ||
| regs.reg7a = detail::get_cpuid(0x7, 0x1); | ||
| regs.reg8 = detail::get_cpuid(0x80000001); |
There was a problem hiding this comment.
Open question: could we avoid some get_cpuid call in some situation? For instance reg8 is only used for fma4, and reg7 for avx2 or later, which mean we could skip filling reg7 if we don't have avx, and fma4 if have fma3.
There was a problem hiding this comment.
Let's postpone this as a further optimization then?
| #elif defined(__INTEL_COMPILER) | ||
| __cpuid(reg.data(), level); |
There was a problem hiding this comment.
I assumle so, feel free to give it a try!
| #elif defined(_MSC_VER) && _MSC_VER >= 1400 | ||
| return static_cast<xcr0_reg_t>(_xgetbv(0)); | ||
|
|
||
| #elif defined(__GNUC__) |
There was a problem hiding this comment.
It's true that clang defines __GNUC__. And it happens to be true for icx too, see https://godbolt.org/z/56j57vhnT
|
@serge-sans-paille in 1/2 day I can push another commit (easily revertable) with things I've been trying lately. |
|
ok, ping me when you need an extra pair of eyes.
|
dd51e01 to
d63a75f
Compare
|
@serge-sans-paille with this last commit we have
It's more LOC (though there is more documentation now, and many empty classes / brace lines), but it's also more declarative from now on. We just need to add more members to the enums to read other things. If you like the design I'd be happy to add tests for all classes with know implications (e.g. |
This is a no-addition, non-breaking refactor of existing CPU id features as first class citizen as part of #1245.
supported_archkeeps the same structure and merges use of both class at the moment but both class could be combined a user-friendlyx86_cpu_features.