mirrored from git://gcc.gnu.org/git/gcc.git
Add <span> header #31
Closed
Implements <span> from C++2a. This header provides std::span and all associated features in <span> as described by https://en.cppreference.com/w/cpp/header/span.
Send this to [email protected].
This is an unofficial mirror that nobody from the GCC project is involved with. Sending pull requests here is a waste of time. Please see https://gcc.gnu.org/contribute.html for how to contribute to GCC, thanks.
nstester pushed a commit to nstester/gcc that referenced this pull request on Aug 19, 2021
As suggested in the PR, the following patch adds two new clrsb expansion possibilities if target doesn't have clrsb_optab for the requested nor wider modes, but does have clz_optab for the requested mode.

One expansion is clrsb (op0) expands as

    clz (op0 ^ (((stype)op0) >> (prec-1))) - 1

which is usable if CLZ_DEFINED_VALUE_AT_ZERO is 2 with value of prec, because the clz argument can be 0 and clrsb should give prec-1 in that case.

The other expansion is

    clz (((op0 << 1) ^ (((stype)op0) >> (prec-1))) | 1)

where the clz argument is never 0, but it is one operation longer.

E.g. on x86_64-linux with -O2 -mno-lzcnt, this results for

    int foo (int x) { return __builtin_clrsb (x); }

in

    -	subq	$8, %rsp
    -	movslq	%edi, %rdi
    -	call	__clrsbdi2
    -	addq	$8, %rsp
    -	subl	$32, %eax
    +	leal	(%rdi,%rdi), %eax
    +	sarl	$31, %edi
    +	xorl	%edi, %eax
    +	orl	$1, %eax
    +	bsrl	%eax, %eax
    +	xorl	$31, %eax

and with -O2 -mlzcnt:

    +	movl	%edi, %eax
    +	sarl	$31, %eax
    +	xorl	%edi, %eax
    +	lzcntl	%eax, %eax
    +	subl	$1, %eax

On armv7hl-linux-gnueabi with -O2:

    -	push	{r4, lr}
    -	bl	__clrsbsi2
    -	pop	{r4, pc}
    +	@ link register save eliminated.
    +	eor	r0, r0, r0, asr #31
    +	clz	r0, r0
    +	sub	r0, r0, #1
    +	bx	lr

As it (at least usually) will make code larger, it is disabled for -Os or cold instructions.

2021-08-19  Jakub Jelinek  <[email protected]>

    PR middle-end/101950
    * optabs.c (expand_clrsb_using_clz): New function.
    (expand_unop): Use it as another clrsb expansion fallback.

    * gcc.target/i386/pr101950-1.c: New test.
    * gcc.target/i386/pr101950-2.c: New test.
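The two expansions described in the commit message above can be checked at the source level. The following is only a rough sketch, not the code in optabs.c: it assumes a GCC-compatible compiler (for `__builtin_clz`) and a 32-bit `unsigned int`, and every function name in it (`clz32_defined_at_zero`, `sign_mask`, `clrsb_reference`, `clrsb_via_clz_at_zero`, `clrsb_via_clz_nonzero`, `check_clrsb_forms`) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Precision of the 32-bit type used in this sketch (an assumption).  */
#define PREC 32

/* Hypothetical helper: clz that yields PREC for a zero argument,
   mimicking a target where clz is defined at zero with value prec.
   __builtin_clz (0) itself is undefined.  */
static int clz32_defined_at_zero (uint32_t x)
{
  return x ? __builtin_clz (x) : PREC;
}

/* Equivalent of ((stype) op0) >> (prec - 1): all ones for negative
   values, zero otherwise.  Written without shifting a negative value.  */
static uint32_t sign_mask (int32_t x)
{
  return x < 0 ? UINT32_MAX : 0;
}

/* Portable reference: count the bits below the sign bit that equal it.  */
static int clrsb_reference (int32_t x)
{
  uint32_t u = (uint32_t) x;
  uint32_t sign = u >> (PREC - 1);
  int n = 0;
  for (int bit = PREC - 2; bit >= 0 && ((u >> bit) & 1) == sign; bit--)
    n++;
  return n;
}

/* First form: clz (op0 ^ (op0 >> (prec-1))) - 1.
   The clz argument can be zero, so clz must return PREC there.  */
int clrsb_via_clz_at_zero (int32_t x)
{
  return clz32_defined_at_zero ((uint32_t) x ^ sign_mask (x)) - 1;
}

/* Second form: clz (((op0 << 1) ^ (op0 >> (prec-1))) | 1).
   The "| 1" guarantees a nonzero clz argument.  */
int clrsb_via_clz_nonzero (int32_t x)
{
  uint32_t u = (uint32_t) x;
  return __builtin_clz (((u << 1) ^ sign_mask (x)) | 1);
}

/* Compare both forms against the reference on a few sample values.  */
void check_clrsb_forms (void)
{
  static const int32_t samples[] =
    { 0, 1, -1, 2, -2, 0x1234, -0x1234, INT32_MAX, INT32_MIN };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (clrsb_via_clz_at_zero (samples[i])
              == clrsb_reference (samples[i]));
      assert (clrsb_via_clz_nonzero (samples[i])
              == clrsb_reference (samples[i]));
    }
}
```

Calling `check_clrsb_forms ()` from a test driver exercises the edge cases the message mentions: for 0 and -1 the first form feeds zero to clz and relies on the defined-at-zero value, while the second form never does.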
xionghul pushed a commit to xionghul/gcc that referenced this pull request on Jan 16, 2023
This recognises the patterns of the form:

    while (n & 1) { n >>= 1 }

Unfortunately there are currently two issues relating to this patch.

Firstly, simplify_using_initial_conditions does not recognise that (n != 0) and ((n & 1) == 0) implies that ((n >> 1) != 0). These preconditions arise following the loop copy-header pass, and the assumptions returned by number_of_iterations_exit_assumptions then prevent final value replacement from using the niter result.

I'm not sure what is the best way to fix this - one approach could be to modify simplify_using_initial_conditions to handle this sort of case, but it seems that it basically wants the information that ranger could give anyway, so would something like that be a better option?

The second issue arises in the vectoriser, which is able to determine that the niter->assumptions are always true. When building with -march=armv8.4-a+sve -S -O3, we get this codegen:

    foo (unsigned int b) {
        int c = 0;

        if (b == 0)
          return PREC;

        while (!(b & (1 << (PREC - 1)))) {
            b <<= 1;
            c++;
        }

        return c;
    }

    foo:
    .LFB0:
            .cfi_startproc
            cmp     w0, 0
            cbz     w0, .L6
            blt     .L7
            lsl     w1, w0, 1
            clz     w2, w1
            cmp     w2, 14
            bls     .L8
            mov     x0, 0
            cntw    x3
            add     w1, w2, 1
            index   z1.s, #0, #1
            whilelo p0.s, wzr, w1
    .L4:
            add     x0, x0, x3
            mov     p1.b, p0.b
            mov     z0.d, z1.d
            whilelo p0.s, w0, w1
            incw    z1.s
            b.any   .L4
            add     z0.s, z0.s, #1
            lastb   w0, p1, z0.s
            ret
            .p2align 2,,3
    .L8:
            mov     w0, 0
            b       .L3
            .p2align 2,,3
    .L13:
            lsl     w1, w1, 1
    .L3:
            add     w0, w0, 1
            tbz     w1, #31, .L13
            ret
            .p2align 2,,3
    .L6:
            mov     w0, 32
            ret
            .p2align 2,,3
    .L7:
            mov     w0, 0
            ret
            .cfi_endproc

In essence, the vectoriser uses the niter information to determine exactly how many iterations of the loop it needs to run. It then uses SVE whilelo instructions to run this number of iterations. The original loop counter is also vectorised, despite only being used in the final iteration, and then the final value of this counter is used as the return value (which is the same as the number of iterations it computed in the first place).

This vectorisation is obviously bad, and I think it exposes a latent bug in the vectoriser, rather than being an issue caused by this specific patch.

gcc/ChangeLog:

    * tree-ssa-loop-niter.cc (number_of_iterations_cltz): New.
    (number_of_iterations_bitcount): Add call to the above.
    (number_of_iterations_exit_assumptions): Add EQ_EXPR case for
    c[lt]z idiom recognition.

gcc/testsuite/ChangeLog:

    * gcc.dg/tree-ssa/cltz-max.c: New test.
    * gcc.dg/tree-ssa/clz-char.c: New test.
    * gcc.dg/tree-ssa/clz-int.c: New test.
    * gcc.dg/tree-ssa/clz-long-long.c: New test.
    * gcc.dg/tree-ssa/clz-long.c: New test.
    * gcc.dg/tree-ssa/ctz-char.c: New test.
    * gcc.dg/tree-ssa/ctz-int.c: New test.
    * gcc.dg/tree-ssa/ctz-long-long.c: New test.
    * gcc.dg/tree-ssa/ctz-long.c: New test.
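To make the idiom concrete, here is a small sketch of the equivalence that the niter analysis described above relies on: the trip count of the shift loop equals a clz (or ctz) of the initial value. It is an illustration only, assuming a GCC-compatible compiler for `__builtin_clz` and `__builtin_ctz`. The names `clz_loop`, `ctz_loop`, `clz_closed_form`, `ctz_closed_form` and `check_cltz_loops` are hypothetical, and the ctz loop is my own counterpart to the clz loop quoted in the message. Whether a given GCC version actually performs the replacement depends on the release and flags.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Precision of unsigned int, playing the role of PREC in the message.  */
#define PREC ((int) (sizeof (unsigned int) * CHAR_BIT))

/* Loop form of clz, following the function quoted in the commit message
   (1u is used here to avoid shifting into the sign bit of an int).  */
int clz_loop (unsigned int b)
{
  int c = 0;
  if (b == 0)
    return PREC;
  while (!(b & (1u << (PREC - 1))))
    {
      b <<= 1;
      c++;
    }
  return c;
}

/* Hypothetical ctz counterpart: shift right until the low bit is set.  */
int ctz_loop (unsigned int n)
{
  int c = 0;
  if (n == 0)
    return PREC;
  while (!(n & 1u))
    {
      n >>= 1;
      c++;
    }
  return c;
}

/* Closed forms: the loop trip counts expressed with the builtins.  */
int clz_closed_form (unsigned int b)
{
  return b ? __builtin_clz (b) : PREC;
}

int ctz_closed_form (unsigned int n)
{
  return n ? __builtin_ctz (n) : PREC;
}

/* Check that each loop agrees with its closed form on sample inputs.  */
void check_cltz_loops (void)
{
  static const unsigned int samples[] =
    { 0u, 1u, 2u, 3u, 8u, 0x1234u, 0x80000000u, UINT_MAX };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (clz_loop (samples[i]) == clz_closed_form (samples[i]));
      assert (ctz_loop (samples[i]) == ctz_closed_form (samples[i]));
    }
}
```

The explicit zero check mirrors the `if (b == 0) return PREC;` guard in the quoted function: without it the loops would not terminate, and the builtins are undefined for a zero argument.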
XYenChi pushed a commit to XYenChi/gcc that referenced this pull request on Mar 6, 2024
cooljeanius referenced this pull request in cooljeanius/gcc on Sep 20, 2024
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
cooljeanius referenced this pull request in cooljeanius/gcc on Sep 20, 2024
Fix code scanning alert #31: Incorrect conversion between integer types
iains referenced this pull request in NinaRanns/gcc on Oct 29, 2024
…-opt Contracts nonattr add eval semantic opt
Implements <span> from C++2a. This header provides std::span and all associated features in <span> as described by https://en.cppreference.com/w/cpp/header/span. The pull request is intended to add these to libstdc++, which does not currently provide them.
This header comes with the guarantee that it follows the specification described in the above resource. It is not guaranteed to be compliant with the latest version of the C++2a standard working draft, or any later version, should the header and/or class specification change.