Closed
ChangeLog: * simd.h: Disable construction from a contiguous range.
ChangeLog: * simd.h: Use to_address(it) instead of addressof(*it)
ChangeLog: * simd.h (simd::operator<<): New. (simd::operator>>): New.
ChangeLog: * simd_reductions.h (reduce): New masked overloads. Constrain _BinaryOperation to avoid ambiguity with masked overload.
ChangeLog: * simd.h (simd::copy_to): Implement masked store without use of the std::experimental base types.
ChangeLog: * detail.h (__detail::__arithmetic): Moved to simd_abi.h. (__detail::__vectorizable): Likewise. * simd.h (simd::operator[]): Implement directly (without calling into _Base). Add special case for _AbiArray. * simd_abi.h (__detail::_AbiArray): New. (__detail::_SimdImplArray): New. (__detail::_MaskImplArray): New (stub). (__detail::_DeduceAbi): Prefer _AbiArray over _AbiCombine.
ChangeLog: * permute.h (simd_permute): Mark helper lambda as always_inline.
ChangeLog: * constexpr_tests.c++: Test __static_range_size. * detail.h (__detail::__static_range_size): Remove constraint, fix SFINAE, and handle C-arrays.
New shorthand __pv2 for std::experimental::parallelism_v2 and std::experimental::parallelism_v2::__proposed. ChangeLog: * Makefile: Fix check-skylake-avx512 target name. * constexpr_tests.c++: Add new tests. Replace __detail with __pv2 scope. * constexpr_wrapper.h: New file. Copied from vir-simd. Add literals. * detail.h: Move __arithmetic and __vectorizable from simd_abi.h. Add __pv2 qualification. * fwddecl.h: Define __pv2 namespace. Declare basic_simd and basic_simd_mask. Declare mask and simd reductions, simd_split, and simd_cat. * mask_reductions.h: Use __pv2 instead of __detail. * simd_split.h: Likewise. * simd_reductions.h: Likewise. Remove defaults that are now in fwddecl.h. * simd.h: Use __pv2 instead of __detail. Don't inherit stdx::simd anymore. (operator[]): Complete range check/assume. Add new case for array _M_data. (_M_is_constprop): Add case for _Impl::_S_is_constprop. * simd_abi.h: Use __pv2 instead of __detail. More complete _AbiArray and implementation. Constrain vectorizable template parameters. Pass arrays by const-ref. (_SimdImplArray::_S_masked_assign): New. (_SimdImplArray::_S_is_constprop): New. (_MaskImplArray): New. (_SimdTupleMeta): New. (_SimdTupleData): New. (_SimdTuple): New. (_SimdImplAbiCombine): New. (_MaskImplAbiCombine::_S_generator): New. * simd_mask.h: Use __pv2 instead of __detail. (operator[]): Complete range check/assume. Add new case for array _M_data.
ChangeLog: * Makefile: Add -fconcepts-diagnostics-depth=3. * constexpr_tests.c++: Test for random_access_range and not output_range. * simd.h (simd::begin, simd::end): New. * simd_iterator.h: New file. * simd_mask.h (simd_mask::begin, simd_mask::end): New.
ChangeLog: * constexpr_wrapper.h:
ChangeLog: * detail.h:
ChangeLog: * detail.h:
ChangeLog: * simd.h:
ChangeLog: * loads_and_stores/ce.cpp:
Returns an unspecified value if none_of(mask) is true. ChangeLog: * detail.h: * mask_reductions.h: * mask_reductions/ce.cpp: New file.
ChangeLog: * fwddecl.h: * mask_reductions.h: * mask_reductions/ce.cpp:
ChangeLog: * mask_reductions.h: * simd_abi.h:
ChangeLog: * simd.h:
Update constexpr_wrapper from vir-simd repo. ChangeLog: * constexpr_wrapper.h:
ChangeLog: * fwddecl.h (simd_alignment, simd_alignment_v): New. * simd.h (simd_alignment): Partial specializations for basic_simd and basic_simd_mask.
ChangeLog: * detail.h:
Copied and modified large parts of the vec-builtin and x86 implementation in <experimental/simd>. Branching on CPU features now uses conditions derived from template parameters, not globals. This should make it easier to adopt multi-arch / multi-veclen compilation at some point (once the compiler supports it). Reduced the number of template and lambda instantiations, to reduce compile time and space requirements. Notably, iterations on index spaces now use a single lambda instead of a function template invoking one lambda specialization per index. Also removing the use of std::invoke has a *huge* impact. ChangeLog: * Makefile: Add more tests and compile them in many different configurations. * arm_detail.h: New file. * constexpr_tests.c++: Add another test for size 7. * constexpr_wrapper.h: Update copyright. * detail.h: Copy and adjust several functions from experimental/simd. Add _FloatingPointFlags, _MachineFlags, and _BuildFlags / __build_flags. Explicitly list the vectorizable types. Preliminary support for std::float(16|32|64)_t. * detail_bitmask.h: New file. * flags.h: Add TS-like load/store flags. * fwddecl.h: Remove TS usage. Add _VecAbi, _Avx512Abi, and _ScalarAbi. * interleave.h: Update copyright. * iota.h: Update copyright. Reduce lambda instantiations. * mask_reductions.h: Move x86 specific code into simd_x86.h. Call into Abi implementation types when available. * permute.h: Update copyright. Replace always_inline macro. * power_detail.h: New file. * simd.h: Add missing unary operators +, -, ~, ++, --. Constrain all operators on using the same operator on the value-type. * simd_abi.h: Implement ABI tag deduction. * simd_builtin.h: New file. * simd_config.h: New file. * simd_converter.h: New file. * simd_iterator.h: Update copyright. * simd_mask.h: Remove base class and implement ABI-specific / implementation-defined conversions. * simd_reductions.h: Move x86 optimization into simd_x86.h. * simd_scalar.h: New file. 
* simd_split.h: Remove <experimental/simd> dependency. * simd_x86.h: New file. * tests/misc.cpp: New file. * tests/shift_left.cpp: New file. * tests/shift_right.cpp: New file. * unittest.h: New file. * x86_detail.h: Add _MachineFlags, __x86_builtin_int, __to_x86_intrin, and __movmsk.
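One possible reading of the remark about index-space iteration is sketched below, assuming C++20. Variant A passes a distinct `std::integral_constant` type per index, so a generic callback is instantiated once per index. Variant B passes a plain `int`, so a single call operator serves every index. Both helper names are hypothetical, and the commit's actual implementation may differ from either variant.

```cpp
#include <type_traits>
#include <utility>

// Variant A (hypothetical): every index has its own type, so a generic
// callback is instantiated N times.
template <int N, typename F>
constexpr void for_each_index_ic(F&& f)
{
  [&]<int... Is>(std::integer_sequence<int, Is...>) {
    (f(std::integral_constant<int, Is>{}), ...);
  }(std::make_integer_sequence<int, N>{});
}

// Variant B (hypothetical): the index is a plain int, so one call operator
// is enough for all N iterations.
template <int N, typename F>
constexpr void for_each_index(F&& f)
{
  for (int i = 0; i < N; ++i)
    f(i);
}

// Usage sketch: sum the indices 0..3 with either variant.
constexpr int sum_a()
{
  int sum = 0;
  for_each_index_ic<4>([&](auto i) { sum += i; });
  return sum;
}

constexpr int sum_b()
{
  int sum = 0;
  for_each_index<4>([&](int i) { sum += i; });
  return sum;
}

static_assert(sum_a() == 6 && sum_b() == 6);
```

Variant B gives up the compile-time index, so it only fits callbacks that do not need the index as a constant expression.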
ChangeLog: * Makefile: Generate all check targets and help targets without using a shell using the foreach and eval make functions. Call a submake in the check recipes with a randomized set of concrete checks to run.
Change _VecAbi and _Avx512Abi to use number of elements as template parameter instead of number of bytes. This simplifies _S_size, _S_full_size, and _S_is_partial, which don't need to be templates anymore. More importantly, it removes the need for passing the value-type as a template parameter to some of the Impl functions. Remove unused _SimdBase and _MaskBase. Have _SimdTraits depend on build flags, adding a new _SimdMaskTraits to work out the right ABI for AVX w/o AVX2. ChangeLog: * arm_detail.h: Include detail.h. * constexpr_tests.c++: Add sanity checks relating to AVX w/o AVX2. * detail.h (__make_dependent): New. (_SimdTraits): Add __build_flags template argument. (_SimdMaskTraits): New. * fwddecl.h (_VecAbi, _Avx512Abi): Change template parameter name. (__native_abi_impl_recursive): Adjust for change from bytes to width. (_DeduceAbi): Move default definition to simd_abi.h. * power_detail.h: Include detail.h. * simd.h (simd::mask_type): Test for void not vectorizable to document intent. (basic_simd(basic_simd_mask) deduction guide): Defer ABI tag deduction to __simd_abi_for_mask trait. * simd_abi.h: Adjust for ABI tag template parameter change from bytes to width. (_AbiCombine): Move from std::__detail to std namespace. (_SimdImplArray::_S_masked_assign): Handle some AVX w/o AVX2 cases. * simd_builtin.h: Adjust for ABI tag template parameter change from bytes to width. * simd_mask.h (__simd_abi_for_mask): New. * simd_scalar.h: Remove template heads from _S_size, _S_full_size, and _S_is_partial. Remove _SimdBase, _MaskBase. * simd_x86.h: Adjust for ABI tag template parameter change. (_SimdMaskTraits): Specialize for the AVX w/o AVX2 case. (_ImplBuiltin::_S_load): Overload for AVX512 bitmasks. (_ImplBuiltin::_S_select_bitmask): Swap argument names. Enable constexpr eval. (_ImplBuiltin::_S_select): Enable constexpr eval. Disambiguate overloads when _MaskMember<_TV> is a bitmask type. 
(_ImplBuiltin::_S_masked_assign): Broadcast scalar argument to a vector when calling _S_select_bitmask. (_ImplBuiltin::_S_bit_and, _S_bit_or, _S_bit_xor, _S_to_bits): Overload for bitmasks. (_ImplBuiltin::_S_bit_shift_right, _S_bit_shift_left): Replace several reinterpret_cast with __vec_bitcast_trunc on return. Adjust ABI/Impl type needed after template parameter change. Fix conditions for sizeof<16 inputs. (_ImplBuiltin::_S_ldexp): Use __make_dependent to instantiate _Rebind only on use of _S_ldexp. Start of supporting sizeof<16 inputs. * unittest.h (instantiate_tests_for_value_type): Sanity check that if simd<T, N> is usable, then the corresponding mask is also usable. * x86_detail.h (__x86_builtin_fp): New. (__to_x86_intrin): Normalize floating-point types using __x86_builtin_fp.
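The effect of switching the ABI tag parameter from bytes to number of elements can be sketched as follows. `vec_abi_bytes` and `vec_abi_width` are hypothetical simplifications, not the real _VecAbi or _Avx512Abi. The point is only that a byte-based tag needs the value type to compute a size, while a width-based tag does not.

```cpp
#include <cstddef>

// Hypothetical tag parameterized on bytes: the element count depends on the
// value type, so size has to be a template.
template <std::size_t Bytes>
struct vec_abi_bytes
{
  template <typename T>
  static constexpr std::size_t size = Bytes / sizeof(T);
};

// Hypothetical tag parameterized on the number of elements: size is a plain
// constant and no value type needs to be passed around.
template <std::size_t Width>
struct vec_abi_width
{
  static constexpr std::size_t size = Width;
};

// Both describe four floats, assuming sizeof(float) == 4.
static_assert(vec_abi_bytes<16>::size<float> == 4);
static_assert(vec_abi_width<4>::size == 4);
```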
60k check targets: - 0.6s for 'make help', listing all 60k targets - 0.5s for 'make debug', parsing the whole Makefile and a bit of output All check targets are shuffled differently on every make invocation without significant overhead. The 'check' target works without sub-make, whereas all the other check-% targets recurse once (which might become a problem with too long command lines). Per -march, a header is generated, a PCH is built from it, and the header is automatically included into the builds. ChangeLog: * Makefile: Rewrite.
ChangeLog: * Makefile: Remove stale help/% target. Accommodate new unittest.h location. Remove -I. flag. Fix required check target. * tests/misc.cpp: Adjust unittest.h include. * tests/shift_left.cpp: Likewise. * tests/shift_right.cpp: Likewise. * unittest.h: Renamed to tests/unittest.h.
ChangeLog: * Makefile: Require 0 failed tests or fail with non-zero exit status.
Use icerun for running tests to enable -j<large number>. ChangeLog: * Makefile: Run tests in icerun. Compile and link in one step if DIRECT is non-empty. Document DIRECT in help target. Unconditionally set DIRECT=1 without icecream. Build without icecream wrapper but with icerun when DIRECT is non-empty.
ChangeLog: * Makefile: Adjust for unittest.h changes. * tests/misc.cpp: Replace main() with register_tests call. * tests/shift_left.cpp: Likewise. * tests/shift_right.cpp: Likewise. * tests/unittest.h: Move into unittest_pch.h, keeping only instantiate_tests_for_value_type. Change how test functions are registered and invoked. * tests/unittest_pch.h: New file.
ChangeLog: * README.md: Document iota, permute, and simd_mask <-> bitset/int paper implementation status. * bits/fwddecl.h (zero_element, uninit_element): New. * bits/permute.h (permute_zero): Remove. (permute): Refactor to call _S_permute. * bits/simd_builtin.h (_S_permute): New. * bits/simd_meta.h (__index_permutation_function_size): Change size argument type to int. * bits/simd_scalar.h (_S_permute): New.
ChangeLog: * bits/fwddecl.h: Move math exposition-only concepts and traits here from simd_math.h for declarations of math functions. (isfinite, isunordered): Declare. * bits/simd_math.h (isfinite, isunordered): New. * bits/mask_reductions.h (all_of, any_of, none_of): Call ABI implementation functions with _M_data member instead of basic_simd_mask object. * bits/simd_abi.h (_S_any_of, _S_all_of, _S_none_of): Change interface from basic_simd_mask to _MaskMember. * bits/simd_builtin.h (_S_any_of, _S_all_of, _S_none_of): Likewise. * bits/simd_scalar.h (_S_any_of, _S_all_of, _S_none_of): Likewise. * bits/simd_x86.h (_S_any_of, _S_all_of, _S_none_of): Likewise. (_S_divides): Use masked divp[hsd] with AVX10/AVX512 on partial vectors. (_S_isnan): Implement consteval/constprop variant for _Avx512Abi.
ChangeLog: * bits/fwddecl.h (__deduce_t): Use __canonical_vec_type to reduce the number of possible _DeduceAbi specializations. Add _PrefAbi argument. * bits/vec_detail.h (__canonical_vec_type): Moved to fwddecl.h. * bits/simd.h (rebind, resize): Determine new ABI tag using _Rebind on ingoing ABI tag. * bits/simd_abi.h (_AbiList::_S_A0_is_valid): Take _S_defer_to_scalar_abi into consideration. (_Rebind): New. (_AbiMaxSize): New. (__is_valid_preferred_abi): New. (_DeduceAbi): Add _PrefAbi argument. Change existing partial specializations to trigger on _NoAbiPreference. Add partial specialization for given _PrefAbi. * bits/simd_builtin.h (_S_defer_to_scalar_abi): New. (_Rebind): Simplify to __deduce_t with _VecAbi. * bits/simd_scalar.h (_S_defer_to_scalar_abi, _Rebind): New. * bits/simd_x86.h (_S_defer_to_scalar_abi): New. (_Rebind): Simplify to __deduce_t with _Avx512Abi. * constexpr_tests.c++: Add _AbiMaxSize and ABI deduction tests.
ChangeLog: * bits/simd.h: Add a reason to deleted functions. * bits/simd_mask.h: Likewise.
ChangeLog: * bits/simd_abi.h (_SimdTuple): Change integral_constants into plain constexpr ints.
Recently __has_single_bit, __bit_ceil, __bit_floor, etc. have started to require unsigned integers, rejecting int. Therefore, cast to unsigned where it was called with a signed int. Fixes gh-1 ChangeLog: * bits/detail.h: * bits/mask_reductions.h: * bits/simd_abi.h: * bits/simd_builtin.h: * bits/simd_converter.h: * bits/simd_reductions.h: * bits/simd_x86.h: * bits/vec_detail.h (__glibcxx_simd_erroneous_unless): New. (__is_power2_minus_1): Moved from detail.h and rewritten to not use bit functions. (__signed_has_single_bit, __signed_bit_ceil): New. (__signed_bit_floor): New. * constexpr_tests.c++:
ChangeLog: * Makefile.common: Pass -p to mkdir.
ChangeLog: * tests/unittest_pch.h: Avoid wrap-around to negative from numeric_limits<long long>::max().
ChangeLog: * bits/simd.h: Use _GLIBCXX_DELETE_MSG instead of delete. * bits/simd_mask.h: Likewise.
This should make the warning go away for users that have warnings in system headers enabled. ChangeLog: * Makefile.common: Remove -Wno-psabi from CXXFLAGS. * bits/simd_x86.h: Ignore -Wpsabi diagnostic. * bits/vec_detail.h: Likewise.
ChangeLog: * .github/workflows/Clang.yml: Run on cplusplus-ci:latest. * .github/workflows/GCC.yml: Likewise. * .github/workflows/build-ci-docker.yml: New file. * Dockerfile: New file.
ChangeLog: * .github/workflows/Clang.yml: Adjust container image URL. * .github/workflows/GCC.yml: Likewise. * .github/workflows/build-ci-docker.yml: Removed. * Dockerfile: Removed.
ChangeLog: * .github/workflows/GCC.yml:
ChangeLog: * Makefile: * README.md:
ChangeLog: * Makefile.common: Determine obj directory from SIMD_OBJ_SUBST if defined.
ChangeLog: * Makefile:
ChangeLog: * bits/simd_x86.h (_S_divides): Add initial branch for consteval and constprop => never call __builtin_ia32_... in those cases.
ChangeLog: * bits/simd_builtin.h (_S_permute): Rewrite using integral_constants instead of plain int.
ChangeLog: * Makefile.common: Replace obj with $(objdir) * Makefile.more: Ditto. * Makefile: Ditto.
ChangeLog: * Makefile.common: * Makefile:
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v4...v5) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
Bumps [fsfe/reuse-action](https://github.com/fsfe/reuse-action) from 5 to 6. - [Release notes](https://github.com/fsfe/reuse-action/releases) - [Commits](fsfe/reuse-action@v5...v6) --- updated-dependencies: - dependency-name: fsfe/reuse-action dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>
OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version. If you change your mind, just re-open this PR and I'll resolve any conflicts on it.
Bumps fsfe/reuse-action from 5 to 6.
Release notes
Sourced from fsfe/reuse-action's releases.
Commits
- 676e2d5 Bump to reuse v6

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
- @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)