Monday, July 27, 2015

.htaccess - htaccess rule to encode only some captured group

My current rule is



RewriteRule ^data/(v[0-9]\.[0-9]\.?[0-9]?)/.*$ http://35.231.131.100:5000/cocoon_$1?subject=https://w3id.org/cocoon/$0 [L,NE,QSA,R=308]


It converts




https://w3id.org/cocoon/data/v1.0.1/2019-03-07/CloudStorageTransactionsPriceSpecification/Azure/managed_disk/transactions-ssd



to



http://35.231.131.100:5000/cocoon_v1.0.1?subject=https://w3id.org/cocoon/data/v1.0.1/2019-03-07/CloudStorageTransactionsPriceSpecification/Azure/managed_disk/transactions-ssd



But for another original URL, such as



https://w3id.org/cocoon/data/v1.0.1/Measurement/DownlinkSpeed-1-128-KB/StorageService/Gcloud/150.203.213.249/lat=-35.271475/long=149.121434/2019-02-26T07%3A14%3A19.932Z/australia-southeast1




I need to percent-encode the value of the subject= query parameter, i.e.



http://35.231.131.100:5000/cocoon_v1.0.1?subject=https%3A%2F%2Fw3id.org%2Fcocoon%2Fdata%2Fv1.0.1%2FMeasurement%2FDownlinkSpeed-1-128-KB%2FStorageService%2FGcloud%2F150.203.213.249%2Flat%3D-35.271475%2Flong%3D149.121434%2F2019-02-26T07%253A14%253A19.932Z%2Faustralia-southeast1
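To make the target encoding concrete, here is a minimal Python sketch that reproduces the redirect URL above from the original URL. It only illustrates the desired output and is not an .htaccess solution. The names BACKEND and build_redirect are hypothetical, chosen for this example.

```python
from urllib.parse import quote

# Hypothetical constant: the backend prefix taken from the rule in the question.
BACKEND = "http://35.231.131.100:5000/cocoon_"

original = (
    "https://w3id.org/cocoon/data/v1.0.1/Measurement/DownlinkSpeed-1-128-KB/"
    "StorageService/Gcloud/150.203.213.249/lat=-35.271475/long=149.121434/"
    "2019-02-26T07%3A14%3A19.932Z/australia-southeast1"
)


def build_redirect(url: str, version: str) -> str:
    """Return the redirect target with the whole URL encoded as the subject value."""
    # safe="" makes quote() escape "/" too. "%" is always escaped, so the
    # existing %3A in the path becomes %253A, as in the desired output.
    return f"{BACKEND}{version}?subject={quote(url, safe='')}"


redirect = build_redirect(original, "v1.0.1")
print(redirect)
```

The version string is passed in unencoded, which corresponds to leaving $1 unescaped in the rule.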



I'm currently using the NE flag so that $1 (i.e. v1.0.1) is not escaped.



How do I encode the https://w3id.org/cocoon/$0 part?



Some background: the : in the date-time part of the URL stops the page from working. Encoding only that character as %3A doesn't work either, so I'm encoding the whole subject= value.







Edit



The rules suggested by MrWhite don't quite work.



RewriteCond %{THE_REQUEST} [a-z]{3,5}\s.*?/(data/(v[0-9]\.[0-9]\.?[0-9]?)/.*)\s [NC]
RewriteRule ^data/(v[0-9]\.[0-9]\.?[0-9]?)/.* http://35.231.131.100:5000/cocoon_$1?subject=https\%3A\%2F\%2Fw3id.org\%2Fcocoon\%2F%1 [L,NE,QSA,R=308]



I tested with



curl http://localhost/cocoon/data/v1.0.1/Measurement/DownlinkSpeed-1-128-KB/StorageService/Gcloud/150.203.213.249/lat=-35.271475/long=149.121434/2019-02-26T07%3A14%3A19.932Z/australia-southeast1


It redirects to
http://35.231.131.100:5000/cocoon_v1.0.1?subject=https%3A%2F%2Fw3id.org%2Fcocoon%2Fdata/v1.0.1/Measurement/DownlinkSpeed-1-128-KB/StorageService/Gcloud/150.203.213.249/lat=-35.271475/long=149.121434/2019-02-26T07%3A14%3A19.932Z/australia-southeast1



My Linked Data Fragments server can't recognize this, because the / isn't encoded. I think the subject parameter doesn't accept a partially encoded string. The : has to be encoded, so the whole subject string has to be encoded.
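A small Python sketch can show how the partially encoded value differs from the fully encoded one once it is decoded. It assumes the receiving server percent-decodes the subject parameter exactly once, which is a guess about the Linked Data Fragments server and not something confirmed here.

```python
from urllib.parse import quote, unquote

prefix = "https://w3id.org/cocoon/"
subject = (
    prefix
    + "data/v1.0.1/Measurement/DownlinkSpeed-1-128-KB/StorageService/Gcloud/"
    "150.203.213.249/lat=-35.271475/long=149.121434/"
    "2019-02-26T07%3A14%3A19.932Z/australia-southeast1"
)

# Fully encoded: every reserved character, including "%", is escaped.
full = quote(subject, safe="")

# Partially encoded, as in the redirect above: only the prefix is escaped,
# the rest of the path (including its literal %3A) is passed through as-is.
partial = "https%3A%2F%2Fw3id.org%2Fcocoon%2F" + subject[len(prefix):]

decoded_full = unquote(full)
decoded_partial = unquote(partial)

print(decoded_full == subject)     # the literal %3A survives one decode
print(decoded_partial == subject)  # here %3A has been turned into ":"
```

Under that assumption, the partially encoded form decodes to a subject containing 07:14:19 instead of 07%3A14%3A19, so it would name a different resource even if the unencoded / were accepted.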




As for the B flag, I tested with B=/, and it seems everything gets encoded twice, i.e. . becomes %252e and / becomes %252f.
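The %25 in those values is the percent-encoding of the % character itself, which is what a second round of encoding would produce. A minimal Python sketch of that pattern follows. Python's quote() emits uppercase hex and never escapes ".", so only the "/" case is shown, and this does not claim to reproduce what Apache does internally.

```python
from urllib.parse import quote

once = quote("/", safe="")    # first round: "/" is escaped
twice = quote(once, safe="")  # second round: the "%" itself is escaped

print(once)
print(twice)
```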



Thank you for pointing out the unintentional trailing dot. What I actually want is v[0-9]\.[0-9](?:\.[0-9])?



I also tried the N flag, but couldn't get it right. It becomes an infinite loop.



RewriteRule ^data/(v[0-9]\.[0-9]\.?[0-9]?)/([^/]+)/(.*) data/$1/$2\%2F$3 [N=20]
RewriteRule ^data/(v[0-9]\.[0-9]\.?[0-9]?)/.* http://35.231.131.100:5000/cocoon_$1?subject=https\%3A\%2F\%2Fw3id.org\%2Fcocoon\%2Fdata\%2F$1\%2F$3 [L,NE,QSA,R=308]



I wanted [^/]+ to match anything that is not a /, so that I could replace every slash after the version number with its encoded value. I added \ to escape the % in %2F.
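As a sketch of the intent only, the same pattern can be applied repeatedly in Python, replacing one slash per pass until it no longer matches. The sample path and the max_passes limit (mirroring N=20) are assumptions for illustration. This models just the regular expression, not how mod_rewrite restarts processing in a per-directory context, so it says nothing about why the real rule loops.

```python
import re

# Same pattern as the first RewriteRule above.
pattern = re.compile(r"^data/(v[0-9]\.[0-9]\.?[0-9]?)/([^/]+)/(.*)")

path = "data/v1.0.1/Measurement/Gcloud/australia-southeast1"  # hypothetical sample
max_passes = 20  # mirrors N=20
passes = 0

while passes < max_passes:
    # Replace the first remaining "/" after the version with %2F.
    new_path, count = pattern.subn(r"data/\1/\2%2F\3", path, count=1)
    if count == 0:
        break  # no "/" left after the version segment
    path = new_path
    passes += 1

print(path)
print(passes)
```

In this isolated form the loop stops by itself after one pass per slash, because [^/]+ absorbs the segments that have already been joined with %2F.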
